15-441 Computer Networking Lecture 16 –TCP in detail Eric Anderson Fall 2013 www.cs.cmu.edu/~prs/15-441-F13
Jan 17, 2016

15-441 Computer Networking

Lecture 16 – TCP in detail

Eric Anderson

Fall 2013

www.cs.cmu.edu/~prs/15-441-F13

Good Ideas So Far…

• Flow control
  • Stop & wait
  • Sliding window
• Loss recovery
  • Timeouts
  • Acknowledgement-driven recovery
    • Selective repeat
    • Cumulative acknowledgement
• Congestion control
  • AIMD fairness and efficiency

• How does TCP actually implement these?

Outline

• TCP connection setup/data transfer

• TCP Packet Loss and Retransmission

• TCP congestion avoidance

• TCP slow start

Sequence Number Space

• Each byte in the byte stream is numbered.
  • 32-bit value
  • Wraps around
  • Initial values selected at start-up time
• TCP breaks up the byte stream into packets.
  • Packet size is limited to the Maximum Segment Size (MSS)
• Each packet has a sequence number.
  • Indicates where it fits in the byte stream

[Figure: packets 8, 9, and 10 laid out along the byte stream, with boundaries at sequence numbers 13450, 14950, 16050, and 17550]
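A minimal sketch of what "wraps around" means for a 32-bit sequence space. The helper names `seq_add` and `seq_lt` are illustrative (not from the lecture), and the packet lengths are inferred from the boundary values 13450, 14950, 16050, and 17550 in the figure.

```python
MOD = 2 ** 32  # sequence numbers are 32-bit values


def seq_add(seq, nbytes):
    """Advance a sequence number by nbytes, wrapping at 2**32."""
    return (seq + nbytes) % MOD


def seq_lt(a, b):
    """True if a comes before b, assuming the two are less than 2**31 apart."""
    return a != b and (b - a) % MOD < 2 ** 31


# Packets 8, 9, 10: each sequence number is the offset of the packet's first byte.
starts = [13450, 14950, 16050]
lengths = [1500, 1100, 1500]  # assumed from the gaps between the boundaries
ends = [seq_add(s, n) for s, n in zip(starts, lengths)]

# A number just below 2**32 still counts as "before" a small number after the wrap.
wrapped_ok = seq_lt(4294967000, 100)
```

Comparing with plain `<` would give the wrong answer across the wrap point, which is why the comparison is done modulo 2**32.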

Establishing Connection: Three-Way Handshake

• Each side notifies the other of the starting sequence number it will use for sending
  • Why not simply choose 0?
    • Must avoid overlap with an earlier incarnation
    • Security issues
• Each side acknowledges the other’s sequence number
  • SYN-ACK: acknowledge sequence number + 1
• Can combine second SYN with first ACK

[Figure: three-way handshake between Client and Server]
  Client → Server: SYN: SeqC
  Server → Client: ACK: SeqC+1, SYN: SeqS
  Client → Server: ACK: SeqS+1

TCP Connection Setup Example

• Client SYN
  • SeqC: Seq. #4019802004, window 65535, max. seg. 1260
• Server SYN-ACK+SYN
  • Receive: #4019802005 (= SeqC+1)
  • SeqS: Seq. #3428951569, window 5840, max. seg. 1460
• Client SYN-ACK
  • Receive: #3428951570 (= SeqS+1)

09:23:33.042318 IP 128.2.222.198.3123 > 192.216.219.96.80: S 4019802004:4019802004(0) win 65535 <mss 1260,nop,nop,sackOK> (DF)

09:23:33.118329 IP 192.216.219.96.80 > 128.2.222.198.3123: S 3428951569:3428951569(0) ack 4019802005 win 5840 <mss 1460,nop,nop,sackOK> (DF)

09:23:33.118405 IP 128.2.222.198.3123 > 192.216.219.96.80: . ack 3428951570 win 65535 (DF)
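A small sketch that checks the "sequence number + 1" rule against the values in the trace above. The function name `expected_ack` is illustrative only.

```python
MOD = 2 ** 32


def expected_ack(syn_seq):
    """A SYN consumes one sequence number, so the peer acknowledges seq + 1."""
    return (syn_seq + 1) % MOD


# Values copied from the trace above.
client_isn = 4019802004   # SeqC in the client's SYN
server_isn = 3428951569   # SeqS in the server's SYN
server_ack = 4019802005   # ack carried by the server's SYN
client_ack = 3428951570   # ack in the client's final segment

handshake_ok = (expected_ack(client_isn) == server_ack
                and expected_ack(server_isn) == client_ack)
```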

TCP State Diagram: Connection Setup

[Figure: TCP state diagram with the connection-setup paths marked for the server (s) and client (c). States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Transition labels (event/action): Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close/FIN, FIN/ACK, ACK + FIN/ACK, Close, Timeout after two segment lifetimes.]

Tearing Down Connection

• Either side can initiate tear down
  • Send FIN signal
  • “I’m not going to send any more data”
• Other side can continue sending data
  • Half-closed connection
  • Must continue to acknowledge
• Acknowledging FIN
  • Acknowledge last sequence number + 1

[Figure: teardown exchange between A and B]
  A → B: FIN, SeqA
  B → A: ACK, SeqA+1
  B → A: Data
  A → B: ACK
  B → A: FIN, SeqB
  A → B: ACK, SeqB+1

TCP Connection Teardown Example

• Session
  • Echo client on 128.2.222.198, server on 128.2.210.194
• Client FIN
  • SeqC: 1489294581
• Server ACK + FIN
  • Ack: 1489294582 (= SeqC+1)
  • SeqS: 1909787689
• Client ACK
  • Ack: 1909787690 (= SeqS+1)

09:54:17.585396 IP 128.2.222.198.4474 > 128.2.210.194.6616: F 1489294581:1489294581(0) ack 1909787689 win 65434 (DF)

09:54:17.585732 IP 128.2.210.194.6616 > 128.2.222.198.4474: F 1909787689:1909787689(0) ack 1489294582 win 5840 (DF)

09:54:17.585764 IP 128.2.222.198.4474 > 128.2.210.194.6616: . ack 1909787690 win 65434 (DF)

TCP State Diagram: Connection Teardown

[Figure: the same TCP state diagram with the teardown paths marked for hosts A and B. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Transition labels (event/action): Close/FIN, FIN/ACK, ACK + FIN/ACK, ACK, Close, Timeout after two segment lifetimes. Annotation: “half-closed”, B→A still open.]

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission
  • Recognizing packet loss
  • Identifying missing packets
  • Retransmission behavior

• TCP congestion avoidance

• TCP slow start

Reliability Challenges

• Congestion-related losses
• Variable packet delays
  • What should the timeout be?
• Reordering of packets
  • How to tell the difference between a delayed packet and a lost one?

TCP = Go-Back-N Variant

• Sliding window with cumulative acks
  • Receiver can only return a single “ack” sequence number to the sender.
    • Acknowledges all bytes with a lower sequence number
    • Starting point for retransmission
  • Duplicate acks sent when out-of-order packet received
• But: sender only retransmits a single packet.
  • Reason???
    • Only one that it knows is lost
    • Network is congested, so it shouldn’t be overloaded
• Error control is based on byte sequences, not packets.
  • Retransmitted packet can be different from the original lost packet
    – Why?

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission
  • Recognizing packet loss
  • Identifying missing packets
  • Retransmission behavior

• TCP congestion avoidance

• TCP slow start

Retransmit Timeout

• How long is too long?

• Well, how long does it usually take?

Early TCP: RTO = 2 x RTT

Last 20 years: RTO = RTT + 4x deviation

• What’s the RTT? What’s the deviation?

Round-trip Time Estimation

• Wait at least one RTT before retransmitting
• Importance of accurate RTT estimators:
  • Low RTT estimate
    • unneeded retransmissions
  • High RTT estimate
    • poor throughput
• RTT estimator must adapt to change in RTT
  • But not too fast, or too slow!
• Spurious timeouts
  • “Conservation of packets” principle – never more than a window worth of packets in flight

Original TCP Round-trip Estimator

• Round trip times exponentially averaged:
  • New RTT = a (old RTT) + (1 - a) (new sample)
  • Recommended value for a: 0.8 - 0.9
    • 0.875 for most TCPs
• Retransmit timer set to (b * RTT), where b = 2
  • Every time timer expires, RTO exponentially backed-off
• Not good at preventing spurious timeouts
  • Why?
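A minimal sketch of the original estimator, using a = 0.875 and b = 2 from the slide. The function names and the example RTT values (100 ms jumping to 300 ms) are made up for illustration.

```python
def ewma_rtt(old_rtt, sample, a=0.875):
    """New RTT = a * (old RTT) + (1 - a) * (new sample)."""
    return a * old_rtt + (1 - a) * sample


def original_rto(rtt, b=2.0):
    """Retransmit timer set to b * RTT."""
    return b * rtt


# The path RTT jumps from 100 ms to 300 ms (for example, a queue builds up).
rtt = ewma_rtt(100.0, 300.0)   # the estimate moves only 1/8 of the way
rto = original_rto(rtt)        # timer based on the lagging estimate
spurious = rto < 300.0         # the timer fires before the next ack can arrive
```

With these numbers the estimate becomes 125 ms and the timer 250 ms, which is shorter than the real 300 ms round trip. That is one way to see why a fixed multiple of a slowly moving average does not prevent spurious timeouts when delay variance is high.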

Jacobson’s Retransmission Timeout

• Key observation:
  • At high loads, round trip variance is high
• Solution:
  • Base RTO on RTT and standard deviation
    • RTO = RTT + 4 * rttvar
  • new_rttvar = b * dev + (1 - b) * old_rttvar
    • dev = linear deviation
    • Inappropriately named – actually smoothed linear deviation
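A sketch of the update rule above. The slide does not give a value for b; b = 0.25 here is an assumption taken from the standard retransmission-timer specification (RFC 6298), as is updating rttvar before the smoothed RTT. The function name and sample values are illustrative.

```python
def jacobson_update(srtt, rttvar, sample, a=0.875, b=0.25):
    """Return (new_srtt, new_rttvar, rto) after one RTT sample.

    a is the RTT smoothing weight from the original estimator;
    b = 0.25 is an assumed deviation weight (not stated on the slide).
    """
    dev = abs(sample - srtt)                 # linear deviation of this sample
    rttvar = b * dev + (1 - b) * rttvar      # smoothed linear deviation
    srtt = a * srtt + (1 - a) * sample       # smoothed RTT
    rto = srtt + 4 * rttvar
    return srtt, rttvar, rto


# Same scenario as a jump from 100 ms to 300 ms, starting with rttvar = 10 ms.
srtt, rttvar, rto = jacobson_update(100.0, 10.0, 300.0)
```

Here the timer grows to 355 ms after a single sample, above the new 300 ms round trip, because the deviation term reacts to the jump much faster than the average does.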

RTT Sample Ambiguity

• Karn’s RTT Estimator
  • If a segment has been retransmitted:
    • Don’t count RTT sample on ACKs for this segment
    • Keep backed off time-out for next packet
    • Reuse RTT estimate only after one successful transmission

[Figure: two timelines between A and B, each showing an original transmission, an RTO, a retransmission, an ACK, and the resulting SampleRTT; the second timeline marks a loss (X). In both cases it is ambiguous which transmission the ACK belongs to.]
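A sketch of the three rules above, layered on the original estimator (a = 0.875, b = 2). The class name `KarnTimer`, its methods, and the sample values are hypothetical.

```python
class KarnTimer:
    """Toy RTO state that ignores RTT samples from retransmitted segments."""

    def __init__(self, srtt, a=0.875, b=2.0):
        self.srtt = srtt
        self.a = a
        self.b = b
        self.rto = b * srtt

    def on_timeout(self):
        # Exponential backoff each time the timer expires.
        self.rto *= 2

    def on_ack(self, sample, retransmitted):
        if retransmitted:
            # Ambiguous sample: skip it and keep the backed-off RTO.
            return
        # Clean sample: update the estimate and recompute the RTO from it.
        self.srtt = self.a * self.srtt + (1 - self.a) * sample
        self.rto = self.b * self.srtt


timer = KarnTimer(srtt=100.0)                    # RTO starts at 200
timer.on_timeout()                               # RTO backs off to 400
timer.on_ack(sample=350.0, retransmitted=True)   # ignored
backed_off_rto = timer.rto
timer.on_ack(sample=100.0, retransmitted=False)  # estimate is used again
```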

Timestamp Extension

• Used to improve timeout mechanism by more accurate measurement of RTT
• When sending a packet, insert current time into option
  • 4 bytes for time, 4 bytes for echoing a received timestamp
• Receiver echoes timestamp in ACK
  • Actually will echo whatever is in timestamp
• Removes retransmission ambiguity
  • Can get RTT sample on any packet

Timer Granularity

• Many TCP implementations set RTO in multiples of 200, 500, or 1000 ms
• Why?
  • Avoid spurious timeouts – RTTs can vary quickly due to cross traffic
  • Make timer interrupts efficient
• What happens for the first couple of packets?
  • Pick a very conservative value (seconds)

ACKs & NACKs

• TCP has no NACK

[Figure: sender and receiver timeline]
  Sender: … Send 12, Send 13, Send 14, Send 15, Send 16, Send 17, Send 18, Send 19 …
  Receiver: … ACK 12, ACK 13, ACK 14, ACK 14, ACK 14, ACK 14

Duplicate ACKs (Fast Retransmit)

• What are duplicate acks (dupacks)?
  • Repeated acks for the same sequence
• When can duplicate acks occur?
  • Loss
  • Packet re-ordering
  • Window update – advertisement of new flow control window
• Assume re-ordering is infrequent and not of large magnitude
  • Receipt of 3 or more duplicate acks is indication of loss
  • Don’t wait for timeout to retransmit packet
  • When does this fail?
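A sketch of the dupack counting described above. The function name is illustrative, and the sketch treats every repeated ack number as a dupack (it does not try to tell window updates apart).

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ack numbers at which a fast retransmit would be triggered."""
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # Third duplicate: retransmit without waiting for the timeout.
                triggers.append(ack)
        else:
            # New data acknowledged: reset the duplicate counter.
            last_ack = ack
            dup_count = 0
    return triggers


# ACK 12, 13, 14, then three more copies of ACK 14.
triggered = fast_retransmit_points([12, 13, 14, 14, 14, 14])
# Only two duplicates: not enough evidence of loss yet.
not_triggered = fast_retransmit_points([12, 13, 14, 14, 14])
```

The second call hints at when this fails: if fewer than three packets follow the lost one, the sender never sees three dupacks and has to fall back on the timeout.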

Duplicate ACKs (Fast Retransmit)

[Figure: sequence number vs. time plot of packets and acks; a lost packet (X) is followed by duplicate acks and then the retransmission]

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission
  • Recognizing packet loss
  • Identifying missing packets
  • Retransmission behavior

• TCP congestion avoidance

• TCP slow start

TCP (Reno variant)

[Figure: sequence number vs. time plot of packets and acks with four lost packets (X). Annotation: “Now what? - timeout”]

Selective ACK (SACK)

[Figure: sequence number vs. time plot of packets and acks with four lost packets (X) and a “Hole” marked in the acknowledged data]

“Partial Progress ACK”

[Figure: sequence number vs. time plot of packets and acks with four lost packets (X) and a “Hole” marked in the acknowledged data]

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission
  • Recognizing packet loss
  • Identifying missing packets
  • Retransmission behavior

• TCP congestion avoidance

• TCP slow start

Fast Recovery

• Each duplicate ack notifies the sender that a single packet has cleared the network
• When < new cwnd packets are outstanding
  • Allow new packets out with each new duplicate acknowledgement
• Behavior
  • Sender is idle for some time – waiting for ½ cwnd worth of dupacks
  • Transmits at original rate after wait
    • Ack clocking rate is same as before loss

Fast Recovery

[Figure: sequence number vs. time plot of packets and acks after a loss (X). Annotation: a packet is sent for each dupack after W/2 dupacks arrive]

Performance Issues

• Timeout >> fast rexmit

• Need 3 dupacks/sacks

• Not great for small transfers
  • Don’t have 3 packets outstanding

• What are real loss patterns like?

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission

• TCP congestion avoidance

• TCP slow start

Additive Increase/Decrease

• Both X1 and X2 increase/decrease by the same amount over time
  • Additive increase improves fairness and additive decrease reduces fairness

[Figure: User 2’s allocation x2 plotted against User 1’s allocation x1, with the efficiency line, the fairness line, and points T0 and T1]

Multiplicative Increase/Decrease

• Both X1 and X2 increase by the same factor over time
  • Extension from origin – constant fairness

[Figure: User 2’s allocation x2 plotted against User 1’s allocation x1, with the efficiency line, the fairness line, and points T0 and T1]

What is the Right Choice?

• Constraints limit us to AIMD
  • Improves or keeps fairness constant at each step
• AIMD moves towards optimal point

[Figure: User 2’s allocation x2 plotted against User 1’s allocation x1, with the efficiency line, the fairness line, and successive operating points x0, x1, x2]
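A toy simulation of why AIMD moves toward the fair point. It assumes two users who see the same feedback at the same time; the function name, capacity, and starting allocations are made up for illustration.

```python
def aimd(x1, x2, capacity, steps, incr=1.0, decr=0.5):
    """Run `steps` rounds of synchronized AIMD for two users sharing one link."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Overload: both users cut their rate by the same factor.
            x1, x2 = x1 * decr, x2 * decr
        else:
            # Spare capacity: both users add the same amount.
            x1, x2 = x1 + incr, x2 + incr
    return x1, x2


# Start far from fair: user 1 has 1 unit, user 2 has 80, link capacity is 100.
x1, x2 = aimd(1.0, 80.0, capacity=100.0, steps=1000)
gap = abs(x1 - x2)
```

Additive steps leave the gap between the two users unchanged, and every multiplicative decrease halves it, so the allocations converge toward the fairness line while oscillating around the efficiency line.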

TCP Congestion Control

• Changes to TCP motivated by ARPANET congestion collapse

• Basic principles
  • AIMD
  • Packet conservation
  • Reaching steady state quickly
  • ACK clocking

AIMD

• Distributed, fair and efficient
• Packet loss is seen as sign of congestion and results in a multiplicative rate decrease
  • Factor of 2
• TCP periodically probes for available bandwidth by increasing its rate

[Figure: rate vs. time plot]

Implementation Issue

• Operating system timers are very coarse – how to pace packets out smoothly?

• Implemented using a congestion window that limits how much data can be in the network.
  • TCP also keeps track of how much data is in transit
• Data can only be sent when the amount of outstanding data is less than the congestion window.
  • The amount of outstanding data is increased on a “send” and decreased on “ack”
  • (last sent – last acked) < congestion window
• Window limited by both congestion and buffering
  • Sender’s maximum window = Min (advertised window, cwnd)
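A sketch of the window bookkeeping above, in bytes. The function name and the example numbers are illustrative.

```python
def usable_window(last_sent, last_acked, cwnd, advertised_window):
    """Bytes the sender may still put in the network right now."""
    # Sender's maximum window = min(advertised window, cwnd).
    window = min(advertised_window, cwnd)
    # Outstanding data grows on each send and shrinks on each ack.
    outstanding = last_sent - last_acked
    return max(0, window - outstanding)


# Flow control is the tighter limit here: 4000 allowed, 3000 outstanding.
room = usable_window(last_sent=5000, last_acked=2000,
                     cwnd=8000, advertised_window=4000)
# Congestion control is the tighter limit here: the sender must wait for acks.
blocked = usable_window(last_sent=5000, last_acked=2000,
                        cwnd=2000, advertised_window=4000)
```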

Packet Conservation

• At equilibrium, inject packet into network only when one is removed
  • Sliding window and not rate controlled
  • But still need to avoid sending a burst of packets that would overflow links
    • Need to carefully pace out packets
    • Helps provide stability
• Need to eliminate spurious retransmissions
  • Accurate RTO estimation
  • Better loss recovery techniques (e.g. fast retransmit)

TCP Packet Pacing

• Congestion window helps to “pace” the transmission of data packets
• In steady state, a packet is sent when an ack is received
  • Data transmission remains smooth, once it is smooth
  • Self-clocking behavior

[Figure: self-clocking loop between Sender and Receiver, with packet spacings labeled Pb and Pr and ack spacings labeled Ar, Ab, and As]

Congestion Avoidance

• If loss occurs when cwnd = W
  • Network can handle 0.5W ~ W segments
  • Set cwnd to 0.5W (multiplicative decrease)
• Upon receiving ACK
  • Increase cwnd by (1 packet)/cwnd
    • What is 1 packet? 1 MSS worth of bytes
    • After cwnd packets have passed by, the increase is approximately 1 MSS
• Implements AIMD
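A sketch of the per-ack increase, with cwnd kept in bytes. MSS = 1460 is an assumed value for illustration (it matches the server's segment size in the setup trace earlier); the function names are hypothetical.

```python
MSS = 1460  # assumed segment size in bytes


def on_ack(cwnd):
    """Additive increase: each ack adds (1 MSS) / (cwnd in segments)."""
    return cwnd + MSS * MSS / cwnd


def on_loss(cwnd):
    """Multiplicative decrease: cut the window in half."""
    return cwnd / 2


# One window of 10 segments is acknowledged, one ack per segment.
cwnd = 10.0 * MSS
start = cwnd
for _ in range(10):
    cwnd = on_ack(cwnd)
growth = cwnd - start          # slightly under 1 MSS, since cwnd grows as acks arrive

halved = on_loss(20.0 * MSS)
```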

Congestion Avoidance Behavior

[Figure: congestion window vs. time sawtooth. Annotations: “Packet loss + retransmit”, “Cut congestion window and rate”, “Grabbing back bandwidth”]

How to Change Window

• When a loss occurs, W packets are outstanding
  • New cwnd = 0.5 * cwnd

• How to get to new state without losing ack clocking?

Outline

• TCP connection setup/data transfer

• Packet Loss and Retransmission

• TCP congestion avoidance

• TCP slow start

Reaching Steady State

• Doing AIMD is fine in steady state but slow…
• How does TCP know what is a good initial rate to start with?
  • Should work both for CDPD (10s of Kbps or less) and for supercomputer links (10 Gbps and growing)
• Quick initial phase to help get up to speed
  • Called “slow start” – Why?

Slow Start Packet Pacing

• How do we get this clocking behavior to start?
  • Initialize cwnd = 1
  • Upon receipt of every ack, cwnd = cwnd + 1
• Implications
  • Window actually increases to W in RTT * log2(W)
  • Can overshoot window and cause packet loss
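A sketch that counts round trips in slow start, with cwnd in packets. It assumes no losses and that every packet in flight is acked once per RTT; the function name is illustrative.

```python
def slow_start_rtts(target, cwnd=1):
    """Count the RTTs needed for cwnd to reach `target` packets."""
    rtts = 0
    while cwnd < target:
        acks = cwnd              # one ack comes back per packet in flight
        for _ in range(acks):
            cwnd += 1            # cwnd = cwnd + 1 on every ack
        rtts += 1
    return rtts


rtts_to_16 = slow_start_rtts(16)       # 1 -> 2 -> 4 -> 8 -> 16
rtts_to_1000 = slow_start_rtts(1000)   # the last round overshoots to 1024
```

Adding one packet per ack doubles the window every RTT, which is where the RTT * log2(W) figure comes from. The overshoot in the second call is the same effect that can cause packet loss at the end of slow start.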

Slow Start Example

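[Figure: slow start timeline with marks 0R, 1R, 2R, 3R one RTT apart. Packet 1 is sent at 0R, packets 2–3 at 1R, packets 4–7 at 2R, and packets 8–15 at 3R, spaced one packet time apart, as acks 1, 2–3, and 4–7 return]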

Slow Start Sequence Plot

[Figure: sequence number vs. time plot of packets and acks during slow start]

Return to Slow Start

• If packet is lost we lose our self clocking as well
  • Need to implement slow-start and congestion avoidance together
• When retransmission occurs set ssthresh to 0.5w
  • If cwnd < ssthresh, use slow start
  • Else use congestion avoidance
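A sketch of how the two phases fit together, with cwnd in packets. Resetting cwnd to 1 is an assumption that models the timeout case; the function names are hypothetical.

```python
def on_ack(cwnd, ssthresh):
    """Grow the window: slow start below ssthresh, congestion avoidance above."""
    if cwnd < ssthresh:
        return cwnd + 1            # slow start: one packet per ack
    return cwnd + 1.0 / cwnd       # congestion avoidance: about one packet per RTT


def on_timeout(cwnd):
    """Return (new_cwnd, new_ssthresh) after a retransmission timeout."""
    # Assumption: the ack clock is lost, so restart from one packet.
    return 1, cwnd / 2


cwnd, ssthresh = on_timeout(16)          # window was 16 when the timeout hit
in_slow_start = on_ack(4, ssthresh)      # below ssthresh: grows by a full packet
in_avoidance = on_ack(8, ssthresh)       # at ssthresh: grows by 1/cwnd
```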

TCP Saw Tooth Behavior

[Figure: congestion window vs. time sawtooth with ssthresh marked. Annotations: “Initial slow start”, “Fast retransmit and recovery”, “Slow start to pace packets”, “Timeouts may still occur”]

Important Lessons

• TCP state diagram: setup/teardown
• TCP timeout calculation: how is RTT estimated
• Modern TCP loss recovery
  • Why are timeouts bad?
  • How to avoid them? e.g. fast retransmit