TLP 1
Goals:
Understand principles behind transport layer services
Instantiation and implementation in the Internet
TLP 2
Transport Layer - Overview
Understanding:
Transport layer services
Multiplexing/demultiplexing
Connectionless transport: UDP
Principles of reliable data transfer
Connection-oriented transport: TCP
  reliable transfer
  flow control
  connection management
TCP congestion control
TLP 3
Transport Services and Protocols
Provide logical communication between application processes running on different hosts
Transport protocols run in end systems (only)
Transport vs network layer services:
  network layer: data transfer between nodes/end systems
  transport layer: data transfer between processes at end systems
  relies on, but enhances, network layer service capability
[Figure: protocol stacks (application, transport, network, data link, physical) at the two end systems, network/data link/physical stacks at the intermediate nodes, and the logical end-end transport between the end systems]
TLP 4
Transport Layer protocols
Internet transport services:
Reliable, in-order unicast delivery: TCP
  • congestion
  • flow control
  • connection setup
Unreliable (best-effort), unordered unicast or multicast delivery: UDP
Services not provided by TCP:
  • real-time (need RTP, RTCP)
  • bandwidth guarantees
  • reliable multicast
[Figure: the same protocol stacks and logical end-end transport as on TLP 3]
[Figure: UDP header fields: Source port, Destination port, Message length, Checksum, Data]
[Figure: TCP header fields: Source port, Destination port, Sequence number, Acknowledgment number, Data offset, Reserved, code bits (URG, ACK, PSH, RST, SYN, FIN), Window, Checksum, Urgent pointer, Option, Padding, TCP data]
[Figure: frame encapsulation: DA, SA, TF, IP header, TCP/UDP header, Data, CRC]
• Multiplexing/Demultiplexing:
  – based on IP addresses and the sender’s and receiver’s port numbers
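As a rough illustration of demultiplexing, the sketch below looks up the receiving socket from the addressing fields of an arriving segment. The table contents, the IP addresses, and the names `udp_demux` and `tcp_demux` are illustrative assumptions, not something taken from the slides.

```python
# Hypothetical demultiplexing tables; every address, port and socket name is made up.
udp_sockets = {
    53: "dns-server-socket",    # keyed on destination port only
    69: "tftp-server-socket",
}

tcp_connections = {
    # (source IP, source port, destination IP, destination port) -> socket
    ("10.0.0.1", 1059, "10.0.0.9", 23): "telnet-conn-A",
    ("10.0.0.2", 1059, "10.0.0.9", 23): "telnet-conn-B",
}

def udp_demux(dst_port):
    """Pick the UDP socket bound to the destination port (None if unbound)."""
    return udp_sockets.get(dst_port)

def tcp_demux(src_ip, src_port, dst_ip, dst_port):
    """Pick the TCP connection identified by the full 4-tuple."""
    return tcp_connections.get((src_ip, src_port, dst_ip, dst_port))
```

Two clients that happen to pick the same source port still reach different connection sockets, because the source IP address is part of the key.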
TLP 9
TCP Well-known Port Numbers
Port Number   Description
0     Reserved
1     TCP Multiplexer
5     Remote Job Entry
7     Echo
9     Discard
11    Active Users
13    Daytime
15    Network status program
17    Quote of the day
19    Character generator
20    FTP data
21    FTP command
23    Terminal Connection
25    SMTP
37    Time
42    Host Name Server
43    Who is
53    Domain Name Server
77    Private RJE service
79    Finger
80    Http protocol
93    Device Control Protocol
95    SUPDUP Protocol
101   NIC host name server
102   ISO-TSAP
103   X.400 mail service
104   X.400 mail sending
111   SUN RPC
113   Authentication Service
117   UUCP-path service
119   USENET news Transfer Protocol
129   Password Generator Protocol
139   NETBIOS Session Service
• Source port numbers ~ randomly assigned by the sending host (1024 < # < 65536)
• Destination port numbers ~ the well-known one (# < 1024) or the incoming source port #
TLP 10
UDP Well-known Port Numbers
Port Number   Description
0     Reserved
7     Echo
9     Discard
11    Active Users
13    Daytime
15    Network status program
17    Quote of the day
19    Character Generator
37    Time
42    Host Name Server
43    Who is
53    Domain Name Server
67    Bootstrap Protocol Server
68    Bootstrap Protocol Client
69    Trivial File Transfer (TFTP)
111   Sun Microsystems RPC
123   Network Time Protocol (NTP)
161   SNMP net monitor
162   SNMP traps
512   UNIX comsat
513   UNIX rwho daemon
514   System log
525   Time daemon
TLP 11
Assigned, Registered and Dynamic Port Numbers
Important issue in application, transport, and link layers
Top of important networking topics!
Being called when data arrives
Being called when pkt arrives
(details coming next)
Network layer
Characteristics of an unreliable channel will determine the complexity of the reliable data transfer (rdt) protocol.
udt ~ unreliable data transfer protocol (IP, here)
TLP 23
Reliable data transfer: getting started
rdt_send(): called from above (e.g., by app.). Passes data to deliver to the receiver's upper layer
udt_send(): called by rdt to transfer a packet over the unreliable channel to the receiver
rdt_rcv(): called when a packet arrives on the rcv-side of the channel
deliver_data(): called by rdt to deliver data to the upper layer
send side
receive side
TLP 24
IP contradicts TCP?
• TCP provides completely reliable transfer
• (But) IP offers best-effort (unreliable) delivery
• TCP uses IP? (YES) How is it done?
Reliable data transmission relies on . . .
- Positive acknowledgment
  ~ Receiver returns a short message (called ACK, acknowledgement) to the sender when data arrives
- Retransmission (upon timeout)
  ~ Sender starts a timer whenever a segment is transmitted
  ~ If the timer expires before the acknowledgment arrives, the sender retransmits THE message
• Recall: C.O.
C.L.
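The two ingredients above, positive acknowledgment and retransmission upon timeout, can be sketched as a tiny simulation. This is only a minimal sketch under stated assumptions: the function name `send_reliably` and the scripted loss pattern are hypothetical, only data segments are lost (ACKs always get through), and a timeout is modelled simply as "no ACK came back for this transmission".

```python
def send_reliably(messages, lost_transmissions):
    """Deliver messages in order over a channel that drops the listed
    transmissions (numbered 0, 1, 2, ... in the order they are sent)."""
    delivered = []   # what the receiver hands to its upper layer
    tx_count = 0     # running count of transmissions put on the channel
    for msg in messages:
        acked = False
        while not acked:
            lost = tx_count in lost_transmissions
            tx_count += 1
            if lost:
                continue           # no ACK arrives: timer expires, retransmit
            delivered.append(msg)  # receiver gets the data ...
            acked = True           # ... and its ACK reaches the sender
    return delivered, tx_count

delivered, transmissions = send_reliably(["a", "b", "c"], {1})
```

With the second transmission dropped, the three messages still arrive in order, at the cost of one extra transmission. Handling lost ACKs as well would additionally require sequence numbers so that the receiver can recognise duplicates.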
TLP 25
TCP Header - I
[Figure: TCP header layout, with the head length and receiver window size fields marked]
• TCP packs data in “segments” but counts/tracks by bytes.
• Seq# and Ack#: counting by bytes of data (not segments)!
TLP 26
TCP Header – II
• Sequence number (SEQ #):
  - identifies each byte in the stream of data from the sending TCP to the receiving TCP (byte streams)
  - numbering ranges from 0 to 2^32 - 1 and wraps back around to 0
  - SEQ # = the (so-called) initial SEQ # (ISN) when SYN = 1 (the first (data) segment = ISN + 1)
• Acknowledgment number:
  - the next sequence number that the receiver expects to receive (i.e., the piggybacked ACK)
    = the SEQ # of the last successfully received data byte + 1
    (ACK 1 when the connection is first established) (flag)
SEQ # is bound to octets rather than to entire segments.
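A small sketch of the byte-counting rule may help: the acknowledgment number is the sequence number of the next byte expected, taken modulo 2^32. The helper name `next_ack` and the sample numbers are assumptions chosen for illustration.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap back to 0 after 2**32 - 1

def next_ack(seq, data_len, syn=False):
    """ACK number returned for a segment that arrived in order.

    seq      -- SEQ # of the first byte in the segment
    data_len -- number of data bytes carried
    syn      -- True if the SYN flag is set (SYN consumes one sequence number)
    """
    consumed = data_len + (1 if syn else 0)
    return (seq + consumed) % SEQ_SPACE
```

For example, a segment with Seq=92 carrying 8 bytes is acknowledged with ACK=100, and a SYN with ISN j is acknowledged with j+1 even though it carries no data.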
TLP 27
TCP Header – III
• Data Offset = header length (HL) in 32-bit words (60 bytes max)
• Code bits:
  - URG: “urgent pointer” field is valid (when it is set to 1)
  - ACK: makes the ACK number valid (when it is set to 1)
  - PSH: sender should send out all data in the sending buffer; receiver should pass this data to an application ASAP
  - RST: reset the connection (port unreachable)
  - SYN: synchronize sequence numbers to initiate a connection
  - FIN: sender is finished sending data (ask to close connection)
• Window (for credit allocation flow control): indicates the number of bytes the sender of this segment is willing to accept
[Side label: conn. management]
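To make the header fields concrete, here is a hedged sketch that decodes the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header`, the dictionary keys, and the hand-built sample segment are assumptions for illustration; options and checksum verification are left out.

```python
import struct

# Bit positions of the six code bits inside the flags byte.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(raw):
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src, dst, seq, ack, off_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (off_byte >> 4) * 4,  # data offset counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window, "checksum": checksum, "urgent": urgent,
    }

# Hand-built SYN segment: port 1059 -> 23, ISN 1000, header length 5 words.
syn = struct.pack("!HHIIBBHHH", 1059, 23, 1000, 0, 5 << 4, 0x02, 4096, 0, 0)
header = parse_tcp_header(syn)
```

Because the data offset is a 4-bit count of 32-bit words, the largest header it can describe is 15 x 4 = 60 bytes, which matches the limit stated above.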
TLP 28
TCP Connection Establishment
• Establish a connection between the two ends before exchanging data
• Connection-establishing protocol ~ three-way handshaking
[Figure: three-way handshake between client and server]
  1. Client (active open; open a conn. || open a socket) sends SYN = 1, Seq# = j (j = ISN) and enters SYN_SENT
  2. Server (Listen, passive open) replies with SYN = 1, Seq# = k, ACK = j+1 (k ~ Rxer’s seq #, its ISN) and enters SYN_RCVD
  3. Client replies with ACK = k+1; the connection is Established at both ends
  Each end initializes TCP variables: seq. #, buffers, flow control info (e.g. RcvWindow)
- SYN consumes one sequence number
- ISN should change over time (differs from connection to connection)
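The handshake can be written down as the three segments it exchanges. This is a minimal sketch, not a protocol implementation: the function name, the dictionary layout, and the sample ISNs are assumptions, and sequence-number wraparound is ignored for brevity.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments of a connection setup, in order."""
    # 1. Client, active open: SYN carrying its ISN (client is now SYN_SENT).
    syn = {"SYN": 1, "seq": client_isn}
    # 2. Server, passive open: SYN with its own ISN, ACKing the client's SYN
    #    (server is now SYN_RCVD).
    syn_ack = {"SYN": 1, "ACK": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # 3. Client: ACK of the server's SYN (both ends are now Established).
    ack = {"ACK": 1, "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]

segments = three_way_handshake(client_isn=100, server_isn=300)
```

Each ACK value is the peer's ISN plus one because the SYN itself consumes one sequence number.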
TLP 29
Decompose PDUs in a TCP/IP Scenario
Windows> telnet 140.124.70.26 (showing the first two packets sent by the client)
Protocol #: Network-Transport layer
TLP 30
Src port # (randomly generated by the src PC: 1059, here)
Dest port # (a well-known one for a well-known application)
[Figure: stop-and-wait timing: first packet bit arrives; last packet bit arrives, send ACK (assuming no error)]
TLP 32
Performance of Stop-and-Wait Protocol (rdt3.0: alternating-bit protocol, textbook)
rdt3.0 works, but performance stinks
Performance issue:
Example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:
T_transmit = L (packet size) / R (channel capacity) = (8 kb/pkt) / (10**9 b/sec) = 8 microsec
• Sender/channel utilization (fraction of time the sender is busy sending bits into the channel):
U_sender = (L / R) / (RTT + L / R) = 0.008 ms / 30.008 ms = 0.00027 (or 0.027%)
(RTT = 15 ms x 2 = 30 ms, with the ACK transmission time ignored)
(ref. P.214)
Send a 1KB pkt every 30.008 msec: effective throughput is only 267 kbps over a 1 Gbps link
The network protocol limits use of physical resources a lot!
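The arithmetic of this example can be checked with a few lines of code. The function name `stop_and_wait_utilization` is an assumption; the numbers are the ones from the slide (8000-bit packet, 1 Gbps link, 30 ms RTT).

```python
def stop_and_wait_utilization(packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return t_transmit / (rtt_s + t_transmit)

u = stop_and_wait_utilization(packet_bits=8_000, rate_bps=1e9, rtt_s=0.030)
throughput_bps = u * 1e9  # effective throughput on the 1 Gbps link
```

Here `u` comes out at roughly 0.00027 and `throughput_bps` at roughly 267 kbps, in line with the figures quoted above.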
TLP 33
Pipelined protocols (Why needed?)
Pipelining: allowing the sender to send multiple “in-flight”, yet-to-be-acknowledged pkts w/o waiting for ACKs
For reliable data transfer:
  the range of sequence numbers must be increased (not retx.)
  need to buffer more than one packet at sender and/or receiver
Two generic forms of pipelined protocols: Go-Back-N and Selective Repeat
• Seq. # range and buffering depend on the manner in which a data transfer protocol responds to lost, corrupted, and overly delayed packets.
[Figure: filling a pipeline. First packet bit transmitted, t = 0 (assuming no error); first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; (next cycle begins)]
Increase utilization by a factor of 3!
U_sender = (3 * L / R) / (RTT + L / R) = 0.024 / 30.008 = 0.0008 (0.08%)
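The same formula generalises to any number of packets kept in flight. The sketch below is an illustration only; the function name `pipelined_utilization` and the cap at 1.0 (a link cannot be more than fully busy) are assumptions added here.

```python
def pipelined_utilization(n_pkts, packet_bits, rate_bps, rtt_s):
    """Sender utilization when n_pkts are sent back to back per cycle."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, n_pkts * t_transmit / (rtt_s + t_transmit))

u1 = pipelined_utilization(1, 8_000, 1e9, 0.030)  # stop-and-wait
u3 = pipelined_utilization(3, 8_000, 1e9, 0.030)  # three packets in flight
```

With three packets in flight the utilization is about 0.0008, three times the stop-and-wait value.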
TLP 35
Go-Back-N
Sender:
k-bit seq # in pkt header
“window” of up to N consecutive unACKed pkts allowed (the window size)
Preview: sliding window
ACK(n): ACKs all pkts up to, including seq # n ~ “cumulative ACK” (Advantage: see Fig. 3.34)
  may receive duplicate ACKs (see receiver) ?? You find it out.
Set timer for each in-flight pkt
timeout(n): retransmit pkt n and all higher seq # pkts in window
TLP 36
GBN (Cont’d)
Receiver:
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  may generate duplicate ACKs
  need only remember expected seqnum
out-of-order packet:
  discard (don’t buffer): no receiver buffering
  ACK pkt with highest in-order seq #
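The sender and receiver rules can be combined into a small round-based simulation. This is a simplification under stated assumptions, not the textbook state machine: the sender transmits a whole window, then applies the cumulative ACK, and a timeout is modelled as "go back to the first unACKed pkt"; ACKs are never lost, and each pkt listed in `lost_first_try` (a hypothetical parameter) is dropped exactly once.

```python
def go_back_n(num_pkts, window, lost_first_try):
    """Return the sequence numbers in the order the sender transmits them."""
    to_lose = set(lost_first_try)  # each listed pkt is dropped once
    base = 0        # sender: oldest unACKed pkt
    expected = 0    # receiver: the only seq # it will accept
    sent_log = []
    while base < num_pkts:
        for seq in range(base, min(base + window, num_pkts)):
            sent_log.append(seq)
            if seq in to_lose:
                to_lose.discard(seq)   # lost in the channel
                continue
            if seq == expected:
                expected += 1          # in order: deliver and ACK
            # otherwise out of order: discard, re-ACK the last in-order pkt
        base = expected  # cumulative ACK slides the window; timeout goes back
    return sent_log

log = go_back_n(num_pkts=5, window=4, lost_first_try={2})
```

Losing pkt 2 forces pkt 3 to be sent twice even though it arrived intact the first time, which is the cost of having no receiver buffering.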
TLP 37
GBN in action
[Figure: GBN in action, showing the receiver's discard events and the sender's retransmission (reTx)]
TLP 38
Selective Repeat/Reject
Receiver individually acknowledges all correctly received pkts
  buffers pkts, as needed, for eventual in-order delivery to upper layer
Sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
Sender window
  N consecutive seq #’s
  again limits seq #s of sent and unACKed pkts
TLP 39
Selective repeat: sender, receiver windows
(Read: Fig. 3.23-25 for the sender’s and receiver’s events and actions)
TLP 40
Selective Repeat in action
[Figure: Selective Repeat in action, window size = 4, with one pkt loss]
TLP 41
Selective Repeat: a dilemma
Example:
  seq #’s: 0, 1, 2, 3 (size = 4)
  window size = 3 < Max seq #
Receiver sees no difference in both scenarios (a) and (b).
Incorrectly passes duplicate data as new in case (a)
Q: To prevent this ambiguity, what should be the relationship between seq # size and window size?
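One way to explore the question is to test, for a given sequence-number space and window, whether an old retransmitted pkt can fall inside the receiver's new window. The sketch below assumes the worst case in which the receiver accepted a full window but every ACK was lost; the function name `sr_is_ambiguous` is hypothetical.

```python
def sr_is_ambiguous(seq_space, window):
    """True if a retransmitted old pkt can look like a new one to the receiver."""
    # Sender still holds pkts 0..window-1 and may retransmit any of them.
    old = {s % seq_space for s in range(0, window)}
    # Receiver has advanced and now accepts the next `window` seq #s.
    new = {s % seq_space for s in range(window, 2 * window)}
    return bool(old & new)
```

With 4 sequence numbers, a window of 3 is ambiguous while a window of 2 is not, which suggests the window should be at most half the sequence-number space.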
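[Figure: two TCP retransmission scenarios between Host A and Host B; time runs downward]
Lost ACK scenario: Host A sends Seq=92, 8 bytes data; Host B's ACK=100 is lost (X); after the timeout Host A resends Seq=92, 8 bytes data and Host B again returns ACK=100. Duplicated. Host B’s action?
Premature timeout, cumulative ACKs: Host A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; Host B returns ACK=100 and ACK=120; the Seq=92 timeout expires first, so Host A resends Seq=92, 8 bytes data and Host B returns ACK=120. New timeout for seq.=92; a Seq=100 timeout is also shown.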
TLP 46
TCP Flow Control
flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
  - RcvWindow field in TCP segment
sender: limits the amount of transmitted, unACKed data to less than the most recently received RcvWindow
  - guarantees receive buffer doesn’t overflow
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
❒ spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
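The window computation and the sender-side check can be sketched directly from the formula. The function names, and the sender variables `last_byte_sent` and `last_byte_acked`, are assumptions introduced for this example; the sample numbers are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def can_send(last_byte_sent, last_byte_acked, window, nbytes):
    """True if nbytes more may be sent without exceeding the advertised window."""
    in_flight = last_byte_sent - last_byte_acked  # transmitted, unACKed data
    return in_flight + nbytes <= window

window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
```

In this example 2000 bytes sit unread in the buffer, so the receiver advertises 2096 bytes of spare room.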
TLP 47
Flow Control - Sliding Window
• To improve the utilization of the channel in the cases of Tprop > Tframe by allowing multiple frames to be transmitted before receiving ACK(s) (to improve the performance of the stop-and-wait mechanism)
• To keep track of which frames have been ACKed while sending without waiting, each frame is labeled with a sequence number.
• Rule of sliding window:
  - Txer maintains a list of SEQ numbers that it is allowed to send
  - Rxer maintains a list of SEQ numbers that it is prepared to receive
  - Frames are numbered (0 ~ 2^k - 1) modulo 2^k, k = # of bits in SEQ #
  - The window size is limited by 2^k, and the SEQ # has a bounded size since it occupies a field in the frame
  - Sender must buffer these frames in case they need to be retransmitted
• Applied to Go-back-N and Selective-reject ARQ, and LLC, HDLC, and X.25
[Figure: window of frames (?)]
TLP 48
Sliding window flow control (cont’d)
[Figure: sliding-window exchange; ACK = RR ~ Receiver Ready (in HDLC)]
Example
Back to GBN
TLP 49
TCP Flow Control - Credit Allocation
• Operation:
  - Sending TCP includes the SEQ # of the first byte in the segment field
  - Receiving TCP ACKs an incoming segment with (A=i, W=j), where
      A=i: expecting SEQ = i; all SEQ prior to i are ACKed
      W=j: granting permission to send an additional j (window) bytes, i.e., corresponding to SEQ # i ~ (i+j-1)
• Some examples of granting credit (assuming Rxer just issued (A=i, W=j)):
  - Rxer issues (A=i, W=k) to increase credit to k (k > j) when no additional data have arrived
  - Rxer issues (A=i+m, W=j-m), without granting additional credit, to ACK an incoming segment containing m bytes (m < j)
TLP 50
Example
- sending 200 bytes/segment; sending and receiving SEQ# are synchronized through connection establishment
- initial credit = 1400 bytes, and SEQ # = 1001
[Figure: credit allocation exchange, with labels A=1001, W=1400; + 600; (granted permission); Remaining credits]
TLP 51
TCP Round Trip Time and Timeout
Q: How to estimate RTT?
SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions and cumulatively ACKed segments
SampleRTT will vary; want estimated RTT “smoother”
  average several recent measurements, not just current SampleRTT
Q: How to set TCP timeout value?
longer than RTT
  note: RTT will vary
too short: premature timeout, unnecessary retransmissions (which is unnecessary)
too long: slow reaction to segment loss
TLP 52
Estimation of RTT
- Exponential weighted moving average (why?)
- influence of a given sample decreases exponentially fast
- typical value of α = 0.125 (RFC 2988)
EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; x-axis: time (seconds); y-axis: RTT (milliseconds), 100 to 350; series: Sample RTT, Estimated RTT]
TLP 53
RTO (Retransmission Time Out)
Setting the timeout
❒ EstimatedRTT plus “safety margin”
  ❍ large variation in EstimatedRTT -> larger safety margin
❒ First estimate of how much SampleRTT deviates from EstimatedRTT:
  DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
  (typically, β = 0.25)
❒ Then set timeout interval:
  TimeoutInterval (RTO) = EstimatedRTT + 4*DevRTT
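The timeout computation can be sketched as one update step per RTT sample, using a weight of 0.125 for the RTT average and 0.25 for the deviation. The function name `update_rto` and the sample values are assumptions, and so is the order of the two updates (the deviation is computed against the previous EstimatedRTT); this is a sketch, not a TCP implementation.

```python
def update_rto(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Fold one SampleRTT into the estimates and return the new timeout.

    All times share one unit (e.g. milliseconds).
    """
    # Deviation first, measured against the previous EstimatedRTT (assumption).
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    # Exponential weighted moving average of the samples.
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    # EstimatedRTT plus a safety margin of four deviations.
    timeout = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout

est, dev, rto = update_rto(estimated_rtt=100, dev_rtt=10, sample_rtt=140)
```

A single late sample of 140 ms moves the estimate only from 100 to 105 ms, but raises the deviation to 17.5 ms and the timeout to 175 ms, so the safety margin grows when the RTT becomes more variable.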
TLP 54
Principles of Congestion Control
Congestion:
  informally: “too many sources sending too much data too fast for network to handle”
  different from flow control (w.r.t. receiver)
  Manifestations:
    lost packets (buffer overflow at routers)
    long delays (queueing in router buffers)
  a top-10 problem!
TLP 55
Approaches towards congestion control
Two broad approaches towards congestion control:
2. Network-assisted congestion control:
❒ routers provide feedback to end systems
  ❍ single bit indicating
❒ CongWin is dynamic, a function of perceived network congestion
How does the sender perceive congestion?
❒ loss event = timeout or 3 duplicate ACKs
❒ TCP sender reduces rate (CongWin) after a loss event
Three mechanisms:
  ❍ AIMD
  ❍ Slow start
  ❍ Congestion Avoidance
rate = CongWin / RTT Bytes/sec
TLP 59
I. TCP AIMD Congestion Control
❒ Additive Increase: ~ increase CongWin by 1 MSS every RTT in the absence of loss events: probing
❒ Multiplicative Decrease: ~ cut CongWin in half after a loss event
[Figure: AIMD operation; congestion window (8, 16, 24 Kbytes) vs. time]
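The sawtooth behaviour of AIMD can be reproduced with a short per-RTT trace. This is a minimal sketch: the function name `aimd`, the unit (whole MSS), the floor of 1 MSS, and the scripted loss at a given RTT are all assumptions.

```python
def aimd(cwnd_mss, rtts, loss_rtts):
    """Return the congestion window (in MSS) at the start of each RTT."""
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd_mss)
        if rtt in loss_rtts:
            cwnd_mss = max(cwnd_mss // 2, 1)  # multiplicative decrease
        else:
            cwnd_mss += 1                     # additive increase: 1 MSS per RTT
    return trace

trace = aimd(cwnd_mss=8, rtts=6, loss_rtts={2})
```

The window climbs by one MSS per RTT, is halved at the loss event, and then resumes its linear climb.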
TLP 60
II. Slow Start
When a connection begins, increase the rate exponentially fast until the first loss event.
• Operation:
  - Initialize cwnd = 1 (1 MSS) whenever opening a new connection
  - Increase cwnd by 1 (up to a max) every time an ACK is received
  - At any time, TCP measures the congestion window in segments and restrains the transmission by
      awnd = Min { credit, cwnd }
    awnd = allowed window (currently allowed to send w/o receiving ACKs)
    cwnd = congestion window (used at startup and reduced during congestion)
    credit = receiver advertised window (used to calculate window/segment size)
• Slow start probes the internet to make sure not to send too many segments into an already congested network
• The connection’s data flow is controlled by the incoming ACKs (not cwnd)
TLP 61
Slow Start Operation
- A is sending 100-byte segments
- A can fill the pipe with a continuous flow of segments after approximately FOUR RTTs
• Slow start may be a misnomer since cwnd grows exponentially (pretty much close to)
Initialization: a new connection
Really slow?
[Figure: slow start timeline with RTT markers (1st RTT to 4th RTT) and segment labels SN = 1, ACK = 101, SN = 101, SN = 201, ACK = 201, SN = 701, ACK = 801, SN = 1401, ACK = 1501]
TLP 62
III. Congestion Avoidance
• Also, Dynamic Window Sizing on Congestion (Jacobson [88/95])
  ~ modifies the growth of cwnd from exponential to linear
  ~ a way to deal with segment loss: a timeout occurring and receipt of duplicate ACKs
III. Congestion Avoidance
• Operation:
  - Begin with the slow start algorithm until congestion occurs:
  - Set ssthresh (a slow start threshold) = cwnd/2
  - Set cwnd = 1 and perform the slow start process (i.e., increase cwnd by 1 for every ACK received) until cwnd = ssthresh
  - For cwnd ≥ ssthresh, increase cwnd by one for each round-trip time (RTT)
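The operation above can be traced RTT by RTT. The sketch below is a simplification under stated assumptions: cwnd is counted in segments, slow start is modelled as doubling once per RTT, a single timeout is triggered when cwnd first reaches `timeout_at_cwnd` (a hypothetical parameter), and the function name `cwnd_trace` is made up.

```python
def cwnd_trace(rtts, timeout_at_cwnd):
    """Return cwnd (in segments) at the start of each RTT."""
    cwnd, ssthresh = 1, None   # ssthresh stays unset until congestion occurs
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if ssthresh is None and cwnd >= timeout_at_cwnd:
            ssthresh = max(cwnd // 2, 1)    # congestion: ssthresh = cwnd/2
            cwnd = 1                        # restart with slow start
        elif ssthresh is None:
            cwnd *= 2                       # initial slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start up to ssthresh
        else:
            cwnd += 1                       # congestion avoidance: linear growth
    return trace

trace = cwnd_trace(rtts=11, timeout_at_cwnd=16)
```

With a timeout at cwnd = 16, ssthresh becomes 8; cwnd then grows 1, 2, 4, 8 exponentially and continues linearly with 9, 10, and so on.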
TLP 63
• Example: Congestion avoidance (cont’d)
[Figure: cwnd per RTT, from the 1st RTT to the 6th RTT: slow start ending with a timeout, then ssthresh = 8; exponential growth of cwnd, then linear growth of cwnd (cwnd = 9)]
- check how long it would take to recover the cwnd level before congestion?
(counted as ONE more RTT)
Comparison of Slow Start and Congestion Avoidance
[Figure: cwnd vs. time (RTT): exponential growth of cwnd from 1 up to ssthresh (8), then linear growth of cwnd (9, ...)]
(See what the textbook says.)
TLP 65
TCP Slow Start Algorithm
Slow start algorithm:
initialize: CongWin = 1
for (each segment ACKed)
    CongWin++
until (loss event OR CongWin > threshold)
[Figure: Host A and Host B over time; A sends one segment, then two segments, then four segments, one RTT apart]
• exponential increase (per RTT) in window size (not so slow!)
• loss event: timeout (Tahoe TCP) and/or three duplicate ACKs (Reno TCP)