Layering and TCP/IP Stack Nicolas Montavont [email protected]Universidad de los Andes Merida, Venezuela May 2011 ULA - May 2011 page TCP / IP network About the slides ! These slides are part of the module on Voice over IP at Universidad de Los Andes - Venezuela ! Many thanks to (alphabetic order) • Annie Gravey (Telecom Bretagne) • David Ros (Telecom Bretagne) • Emil Ivov (jitsi) • G6 - the french IPv6 task force • German Castignani (Telecom Bretagne) • Gilbert Martineau (Telecom Bretagne • Kurose and Ross - Computer Network, a top down approach • Laurent Toutain (Telecom Bretagne) • Xavier Lagrange (Telecom Bretagne) 2
- Discussion within mailing lists and at the IETF meetings
– Try to make the draft become a working group item: draft-ietf-xxxwg-my-subjet-00.txt
– Reach a consensus on the mailing list (known as Last Call)
– Give the document to an Area Director
– Last call in all groups
– If accepted: send it to the RFC Editor (and IANA, if values need to be allocated)
- RFC: proposed standard, then draft standard and finally standard
The IETF Internet Engineering Task Force
RFCs
! Different classes of RFC:
• Documents coming from the standardization process (proposed standard, draft standard, standard)
• Other
- Experimental
- BCP (Best Current Practice)
- Informational
- ...
• Watch the date!
- RFC 1149 (April 1st 1990): A Standard for the Transmission of IP Datagrams on Avian Carriers
- RFC 2549 (April 1st 1999): IP over Avian Carriers with Quality of Service
- RFC 3514 (April 1st 2003): the evil bit in the IP header
Standards evolution
! In general, one protocol ≠ one RFC
• Example: TCP
- RFC 793 (first specification)
- RFC 1122 (Requirements for Internet Hosts)
- RFC 1323 (Extensions for High Performance)
- RFC 2018, 2883 (Selective Acknowledgment)
- RFC 2581 (Congestion Control)
- RFC 2988 (Retransmission Timer)
- etc.
Research activities: IRTF
! http://www.irtf.org/
! Looks ahead over a much longer term
! Sometimes groups are private
• Decided by the chair
• Mailing lists are public
! Some examples
• End-to-end (closed today)
• Anti-spam
• Virtual networks
• DTN (Delay/Disruption Tolerant Network)
! Nowadays, research is carried out through scientific publications
Ethernet and 802.3
Expand the link layer
Application
Transport
Network
Link
Physical
LLC (Logical Link Control): error control, flow control
MAC (Medium Access Control): share the link, addressing
Local Area Network
! Medium shared by several devices
• Operation mode: broadcast, with rules that define how to access the medium
- No centralized operation
! If the medium is shared, we need
• To be able to identify a device → addresses
• Rules to access the medium
! Some properties
• No configuration required to operate
• Need not be scalable
• Addresses still need to be unique
Architecture
IEEE 802.3 and Ethernet
! Ethernet vs. IEEE 802.3:
• Same physical layer
• Same medium access method
• MAC frame:
- Same format...
- ...but different usage of one field
→ Two incompatible protocols
IEEE addressing
! Made of 2 parts
• Manufacturer part (named OUI code), purchased from the IEEE by a manufacturer, which guarantees uniqueness
• Identification part (serial number)
- For a given manufacturer, must be unique
! The MAC address format is the same whatever the protocol (Ethernet, Wi-Fi, ...)
• Eases network interconnection
MAC address format (IEEE 802.1 standard)
! Unique address (worldwide): roughly 10^14 different addresses (2^46 ≈ 7 × 10^13)
! Access all OUI codes: http://standards.ieee.org/regauth/oui/index.shtml
Address layout (48 bits): I/G bit | U/L bit | 46 bits
• I/G bit: 0 = I (individual address), 1 = G (group address)
• U/L bit: 0 = U (universal address), 1 = L (local address)
• Universal individual address: 3 bytes of manufacturer code (OUI) followed by 3 bytes of serial number
MAC address format (802.1)
OUI code (3 bytes, hexadecimal) | Manufacturer
00-00-0C | Cisco
00-03-93 | Apple
02-80-8C | 3Com
08-00-20 | Sun
08-00-5A | IBM
! Three address families, three operating modes
• Point-to-point (unicast): designates one device
• Broadcast: designates all devices on the network (FF-FF-FF-FF-FF-FF)
• Restricted broadcast (multicast): designates a subset of the devices (first bit of the address equal to 1)
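The three address families and the I/G and U/L bits can be read directly from the first byte of a MAC address. The sketch below is only an illustration: the function names `parse_mac` and `describe_mac` are hypothetical, and it assumes the usual Ethernet convention in which the I/G bit (the first bit sent on the wire) is the least significant bit of the first byte and the U/L bit is the next one.

```python
def parse_mac(text):
    """Parse 'aa-bb-cc-dd-ee-ff' or 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 bytes, got {len(parts)}")
    return bytes(int(part, 16) for part in parts)


def describe_mac(text):
    """Return the address family, the U/L flag and the OUI / serial split."""
    mac = parse_mac(text)
    is_group = bool(mac[0] & 0x01)  # I/G bit: 0 = individual, 1 = group
    is_local = bool(mac[0] & 0x02)  # U/L bit: 0 = universal, 1 = local
    if mac == b"\xff" * 6:
        kind = "broadcast"
    elif is_group:
        kind = "multicast"
    else:
        kind = "unicast"
    return {
        "kind": kind,
        "local": is_local,
        "oui": mac[:3].hex("-").upper(),     # manufacturer code
        "serial": mac[3:].hex("-").upper(),  # identification part
    }
```

For example, `describe_mac("00-00-0C-12-34-56")` reports a unicast address whose OUI is the Cisco code listed above (the last three bytes are made up for the example).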
Frame from D to B
Unicast
! D sends the message on the shared medium, where it is physically broadcast
! The network interfaces of all stations receive the message
! Only the interface configured with the destination MAC address forwards the message up the stack
• Layer 2 filtering: done by the card
Access mode: CSMA/CD
! Carrier Sense Multiple Access / Collision Detection
• CSMA: before transmitting, the sender probes the channel to detect any ongoing transmission
• CD: the sender checks whether someone else is also sending at the same time (= collision)
! Collision = bad reception
• Frames need to be transmitted again
CSMA/CD : simplified algorithm
1. If the channel is free, send the frame
2. If the channel is busy, wait for it to become idle, then send
3. If a collision occurs
a. Stop the transmission
b. Wait a random time and go back to step 1
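The simplified algorithm can be sketched as a small loop. This is a minimal illustration, not the 802.3 state machine: `channel_is_idle` and `transmit` are hypothetical callbacks standing in for the network card, and the attempt limit and maximum wait are assumed values. The random wait is drawn uniformly, which is simpler than the real backoff algorithm.

```python
import random
import time


def csma_cd_send(channel_is_idle, transmit, max_attempts=16,
                 max_wait=0.001, sleep=time.sleep):
    """Send one frame; return the number of attempts that were needed.

    channel_is_idle() -> bool : carrier sense (hypothetical callback)
    transmit() -> bool        : True if a collision was detected
    """
    for attempt in range(1, max_attempts + 1):
        # Step 2: while the channel is busy, wait for it to become idle.
        while not channel_is_idle():
            sleep(0)
        # Step 1: the channel is free, send the frame.
        collided = transmit()
        if not collided:
            return attempt
        # Step 3: transmission was stopped; wait a random time, then retry.
        sleep(random.uniform(0, max_wait))
    raise RuntimeError("too many collisions, giving up")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays.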
Transmission on an idle media
A probes the channel: channel free → sending
Sending on a busy channel
B probes the channel: channel busy → wait ...
... until the channel is idle
A probes the channel: channel free → sending
[Figure: time diagram for station A and station B, showing the propagation delay and a collision]
! Collisions happen because of the channel propagation delay
• A probes the channel: it’s idle, let’s send
• B probes the channel: detects a signal, waits
• B detects that the channel is idle → sends the frame
• A detects a collision → stops transmitting
Frame size
! The frame size needs both an upper bound and a lower bound
! Maximum size
• Goal: prevent a station from using the channel for too long
• Fixed to 1518 bytes
! Minimum size
• Goal: help in detecting collisions (see next slide)
• Fixed to 64 bytes
If frames were too small
[Figure: time diagram with stations A, B, C and D; max. propagation delay = τ; sending time < 2τ; collision]
A and B: stations that are far apart
In this example, A and B believe they transmitted their messages successfully, but:
• A and B do not detect the collision
• C correctly receives the frame from A, but not the one from B
• D correctly receives the frame from B, but not the one from A
Adding padding (additional bits at the end of the frame), so that the sending time is at least twice the propagation delay, avoids this problem
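The condition "sending time at least twice the propagation delay" can be checked numerically. The sketch below uses the 64-byte minimum and the 10 Mbit/s rate that appear in these slides; the 20 µs propagation delay used in the example is an assumed value, and the function names are hypothetical.

```python
MIN_FRAME_BYTES = 64  # minimum frame size given in the slides


def sending_time(frame_bytes, bit_rate):
    """Time (in seconds) needed to put a frame on the wire."""
    return frame_bytes * 8 / bit_rate


def collision_detectable(frame_bytes, bit_rate, max_propagation_delay):
    """True if the sender is still transmitting when a collision can come back."""
    return sending_time(frame_bytes, bit_rate) >= 2 * max_propagation_delay


def padding_needed(frame_bytes, min_bytes=MIN_FRAME_BYTES):
    """Number of padding bytes to append to reach the minimum size."""
    return max(0, min_bytes - frame_bytes)
```

At 10 Mbit/s a 64-byte frame takes 512 bit times to send, so `collision_detectable(64, 10e6, 20e-6)` holds, whereas a 16-byte frame would finish before a distant collision could be noticed and needs `padding_needed(16)` extra bytes.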
If the minimum frame size is bounded...
[Figure: time diagram with stations A, B, C and D; example with a minimum frame duration = 2τ + ε; collision detected by A at 2τ; jamming data (reinforcing the collision) sent for a time δ < ε, up to 2τ + δ]
In this example, the minimum sending time is greater than twice the propagation delay
! A sees the channel free and starts sending
! B sees the channel free and starts sending
! B notices the collision almost immediately
! B keeps transmitting for a short while, so that the other devices can clearly detect the collision
Frame format
Encapsulation at the physical layer
Field: Preamble | Start of frame | MAC data | Minimum inter-frame silence (IFS) | Preamble of the next frame
Size: 7 bytes | 1 byte | from 64 to 1518 bytes
Preamble = 7 × (10101010 in binary); start of frame = 10101011 in binary
! Preamble: allows the receiver to synchronise (10101010 in binary = square wave in Manchester coding)
! Inter-frame silence: separates two successive frames
• 802.3 / Ethernet at 10 Mbit/s: IFS = 9.6 µs
! More details on CSMA/CD and the backoff algorithm
! LLC - Logical Link Control
! Type of cable and bandwidth
! Switch
! Bridge - spanning tree algorithm
! Virtual LAN (VLAN)
The IP layer
IPv4 header format
Version | Header length | Type of service | Total length (in bytes)
Identification | Flags | Fragment position
Time to Live (TTL) | Protocol | Header checksum
IP source address
IP destination address
Options (if any)
Data (from the upper layer)
! Total length (16 bits)
• Theoretical maximum = 2^16 – 1 = 65 535 bytes
• IP over Ethernet: needed to distinguish the IP packet from the padding
[Figure: Ethernet frame = Destination address | Source address | protocol = IP (0x800) | IP packet (≥ 0 bytes) | padding (≥ 0 bytes) | CRC; only the IP packet is sent to the upper layer]
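As an illustration of the header layout and of the role of the Total length field, the sketch below decodes the 20 fixed bytes of an IPv4 header with Python's `struct` module and uses the total length to drop Ethernet padding. The function names are hypothetical and options are not decoded.

```python
import socket
import struct


def parse_ipv4_header(data):
    """Decode the fixed 20-byte part of an IPv4 header (options ignored)."""
    (ver_ihl, tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,         # stored in 32-bit words
        "tos": tos,
        "total_length": total_length,                  # header + data, in bytes
        "identification": identification,
        "df": bool(flags_frag & 0x4000),
        "mf": bool(flags_frag & 0x2000),
        "fragment_offset": (flags_frag & 0x1FFF) * 8,  # stored in 8-byte units
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


def strip_ethernet_padding(payload):
    """Keep only the IP packet: anything beyond Total length is padding."""
    return payload[:parse_ipv4_header(payload)["total_length"]]
```

A 40-byte packet carried in a minimum-size Ethernet frame arrives with 6 bytes of padding; `strip_ethernet_padding` returns the original 40 bytes.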
Type of service (TOS)
! Current definition (RFC 2474, RFC 3168): differentiated services (DiffServ) + explicit congestion notification (ECN)
• DSCP field:
- 8 values for Class Selector (compatible with the previous Priority field)
- 12 values for Assured Forwarding
- 1 value for Expedited Forwarding
Bits 0 to 5: DiffServ Code Point (DSCP); bits 6 and 7: ECN
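Splitting the TOS byte into its two fields is a matter of shifting and masking. A minimal sketch (the function name `split_tos` is hypothetical):

```python
def split_tos(tos):
    """Split the 8-bit TOS byte into (DSCP, ECN)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("the TOS field is a single byte")
    dscp = tos >> 2    # the 6 leading bits
    ecn = tos & 0x03   # the 2 trailing bits
    return dscp, ecn
```

For instance, a TOS byte of 0xB8 carries DSCP 46, the code point commonly used for Expedited Forwarding, with ECN set to 0.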
Fragmentation
! If the link MTU does not allow transporting the packet
• Send the packet in fragments
! Reassembly is performed by the destination
! Costly for routers
Figure: [Tanenbaum, 2002]
Fragmentation: header fields
! Identification: unique number (for the sender)
• If the packet is fragmented again, all fragments will still have this number
! Position of the fragment: position of the 1st byte of the fragment in the original datagram
• Counted in units of 8 bytes, so cuts fall on multiples of 8 bytes
! DF (don’t fragment) = 1: the packet must not be fragmented
• If fragmentation is needed: the packet is discarded and an ICMP message is returned to the source
! MF (more fragments)
• MF = 0: last fragment
! Flags by default (for a non-fragmented packet): DF = MF = 0
Field: Identification | 0 | DF | MF | Position of the fragment
Size: 16 bits | 1 bit | 1 bit | 1 bit | 13 bits
Fragmentation: an example
Path: E → R1 → R2 → D, with max. data = 4096 (E to R1), max. data = 1024 (R1 to R2), max. data = 512 (R2 to D)
Columns: Identification | 0 | DF | MF | Position
Sent by E (2021 bytes of data):
123456 | 0 | 0 | 0 | 0
After R1:
123456 | 0 | 0 | 1 | 0 (1024 bytes)
123456 | 0 | 0 | 0 | 128 (997 bytes)
After R2:
123456 | 0 | 0 | 1 | 0 (512 bytes)
123456 | 0 | 0 | 1 | 64 (512 bytes)
123456 | 0 | 0 | 1 | 128 (512 bytes)
123456 | 0 | 0 | 0 | 192 (485 bytes)
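The numbers of this example can be reproduced with a short function. This is a sketch under simplifying assumptions: it only tracks the position (in 8-byte units), the data length and the MF flag, it ignores headers and the DF flag, and the name `fragment` is hypothetical.

```python
def fragment(position, length, more_fragments, max_data):
    """Split one datagram or fragment so that each piece carries <= max_data bytes.

    position is expressed in 8-byte units, as in the IP header.
    Returns a list of (position, length, mf) tuples.
    """
    if length <= max_data:
        return [(position, length, more_fragments)]
    chunk = (max_data // 8) * 8  # every piece but the last is a multiple of 8 bytes
    if chunk == 0:
        raise ValueError("max_data is too small to fragment")
    pieces = []
    sent = 0
    while sent < length:
        size = min(chunk, length - sent)
        is_last = sent + size == length
        # MF stays set if the input was not itself the last fragment.
        pieces.append((position + sent // 8, size, more_fragments or not is_last))
        sent += size
    return pieces


# E sends 2021 bytes of data; R1 forwards over a link with max. data = 1024,
# then R2 forwards each fragment over a link with max. data = 512.
after_r1 = fragment(0, 2021, False, 1024)
after_r2 = [piece for frag in after_r1 for piece in fragment(*frag, 512)]
```

`after_r1` holds the positions 0 and 128, and `after_r2` the positions 0, 64, 128 and 192, with MF cleared only on the final 485-byte fragment.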
Time To Live field (TTL)
! Initialised with a value > 0
• Typical value = 64
! Decremented by 1:
• Each time a packet crosses a router
• Once per second, if the packet is waiting for reassembly at the destination
[Figure: hosts A and B, routers R1, R2 and R3 forming a routing loop (when routing tables are erroneous)]
Protocol field
[Figure: protocol demultiplexing across the Link, Network, Transport and Application layers]
Link: Ethernet / SNAP → type = 0x800: IP; type = 0x806: ARP
Network: IP → protocol = 1: ICMP; protocol = 6: TCP; protocol = 17: UDP
Application: ping, traceroute, DHCP
IP addresses
! Why do we need addresses?
• Is it identification then?
• Location only!
! How many addresses per node? Who gets an address?
! Possible analogy
• Fixed telephone number? Almost...
IP Addresses
! 32 bits
• Human representation: 4 blocks of 1 byte, separated by dots
• Decimal representation
! Includes a subnet (network) address and an interface address
• Delimited by the netmask: 131.254.100.48/24
Binary: 10000011 11111110 01100100 00110000
Decimal: 131 . 254 . 100 . 48
Netmask: 11111...11111 000...0000 (32 bits; the boundary between the prefix and the identifier is variable)
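Python's standard `ipaddress` module can be used to see how the netmask delimits the two parts of an address. The helper names below are hypothetical; the address is the one used on this slide.

```python
import ipaddress


def split_address(cidr):
    """Return (network prefix, interface identifier, dotted binary form)."""
    iface = ipaddress.ip_interface(cidr)
    host_id = int(iface.ip) & int(iface.hostmask)  # bits not covered by the netmask
    binary = ".".join(f"{byte:08b}" for byte in iface.ip.packed)
    return str(iface.network), host_id, binary


def same_subnet(cidr, other_address):
    """True if other_address belongs to the subnet of the given interface."""
    return ipaddress.ip_address(other_address) in ipaddress.ip_interface(cidr).network


network, host_id, binary = split_address("131.254.100.48/24")
```

With a /24 netmask, the prefix is 131.254.100.0/24 and the interface identifier is 48.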
IP Addresses: subnets
[Figure: a router connecting two subnets]
Net1: router interface 10.1.1.1/24; PC1 10.1.1.101/24; PC2 10.1.1.102/24; PC3 10.1.1.103/24
Net2: router interface 10.1.2.1/24; PC4 10.1.2.101/24; PC5 10.1.2.102/24
IP Addresses and Netmask: example
Which part belongs to the network, and which one belongs to the interface?
Figure: [Toutain, 2003] (an IP address split into its Network and Interface parts)
Particular IP addresses
! Loopback: no packets are sent on the network
! Broadcast: reaches all nodes on the local network
• Matching between IP broadcast and MAC broadcast
[Figure from [Tanenbaum, 2002]: special IP address formats (this host; a host in this network; broadcast on the local network; broadcast on a distant network; loopback, followed by any value)]
What do we use?
What do we want to use? Names
What is the network actually using? Addresses
Name resolution system: Domain Name System
Where is ietf.org?
It is at 56.54.29.4
IP addresses: allocation and thoughts
! IP address: unique identifier with a global scope
• Addressing: needs to be scalable
! 32 bits: (in theory) gives 2^32 ≈ 4 billion different addresses
• Fixed size to simplify the routing decision
- Address management per packet (datagram!)
! How can IP addresses be allocated?
• If random choice
- What about duplication? (address uniqueness)
- Routing?
• Fixed allocation to nodes
- What about nomadism?
• Need an efficient allocation system
- Finite address space: no waste!
IP addresses allocation
! Before 1994
• Classful addressing
! After 1994
• Classless addressing called CIDR (Classless Interdomain Routing)
! Why do we always need more?*
• Mobile devices
• “Always on connections” - compared to dial up modems
• Internet demographics
• Inefficient address space use
• Virtualization
*Source: Wikipedia - http://en.wikipedia.org/wiki/IPv4_address_exhaustion
IP addresses classes (before 1994)
Figure: [Tanenbaum, 2002]
! 126 class A networks of ≈ 16 × 10^6 hosts each
! About 16 × 10^3 class B networks of ≈ 65 × 10^3 hosts each
! About 2 × 10^6 class C networks of ≈ 250 hosts each
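The orders of magnitude above follow from the number of bits left for hosts in each class. A small sketch, assuming the classic class boundaries on the first byte (class A below 128, class B below 192, class C below 224); the function names are hypothetical.

```python
def address_class(address):
    """Return (class letter, prefix length) under the pre-1994 classful rules."""
    first_byte = int(address.split(".")[0])
    if first_byte < 128:
        return "A", 8
    if first_byte < 192:
        return "B", 16
    if first_byte < 224:
        return "C", 24
    if first_byte < 240:
        return "D", None  # multicast, no network/host split
    return "E", None      # reserved


def hosts_per_network(prefix_length):
    """Usable host addresses: all-zeros and all-ones host parts are excluded."""
    return 2 ** (32 - prefix_length) - 2
```

`hosts_per_network(8)`, `hosts_per_network(16)` and `hosts_per_network(24)` give 16 777 214, 65 534 and 254, matching the approximate figures on this slide.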
IP addressing (before 1994)
! Flat addressing space
• No hierarchical numbering
• Management through a central entity - Network Information Center
• No relation between an address and the geographic position: simplifies the administration
- 128.92 / 16 = IntelliCorp (US)
- 128.93 / 16 = INRIA (France)
! However...
• Very inefficient use of the address space
- Choice between a class C (/24 - 254 possible hosts) and a class B (/16 - 65 000 possible hosts)
- Depletion of the class B address space...
! Evolution
• CIDR - classless addresses
• Private addresses + NAT
• IPv6 (128-bit addresses)
Network Address Translation
! Basics (RFC 3022)
• Share IP address(es) between several users
• The number of IP addresses is usually smaller than the number of clients
- Example: a company network
! Network Address and Port Translation
• Usage of the port number to differentiate users
• Inside the LAN, each device has a different (private) IP address
– Typically 10.0.0.0 / 8
• From the outside, only a small number of IP addresses are used
– Public (routable) addresses
! Interface between the LAN & Internet = NAT box
• Dynamic conversion of the addresses for incoming and outgoing packets
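The dynamic conversion performed by a NAT box can be pictured as a pair of lookup tables keyed by port number. The class below is a toy sketch, not a description of any real implementation: the name `Napt`, the starting port 40000 and the public address 192.0.2.1 (a documentation address) are all assumptions, and protocols, timeouts and port exhaustion are ignored.

```python
class Napt:
    """Toy Network Address and Port Translation table."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private ip, private port) -> public port
        self.inbound = {}   # public port -> (private ip, private port)

    def outgoing(self, src_ip, src_port):
        """Rewrite the source of a packet leaving the LAN."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def incoming(self, dst_port):
        """Find the private destination of a packet coming from outside."""
        return self.inbound.get(dst_port)  # None if no mapping exists


nat = Napt("192.0.2.1")
first = nat.outgoing("10.0.0.5", 5060)
second = nat.outgoing("10.0.0.6", 5060)
```

Two hosts using the same private port end up behind the same public address but with different public ports, which is how replies find their way back.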
! Matching between network address (IP) ↔ MAC address
• Applications only handle IP addresses
• Frames are exchanged using MAC addresses
! We need to know the destination MAC address to send a frame
! May also be used for duplicate address detection
! Dynamic cache
• Built and updated by the system
• Each entry has a finite lifetime
morrocoy[15:24]% arp -a
default-gw.irisa.fr (131.254.1.1) at 0:4:80:13:69:0
air.irisa.fr (131.254.60.130) at 8:0:20:89:58:95
sky.irisa.fr (131.254.60.147) at 8:0:20:ac:44:3
cuvert1.irisa.fr (131.254.70.14) at 8:0:11:13:99:e5
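A dynamic cache whose entries expire can be sketched in a few lines. This is only an illustration: the class name `ArpCache` and the 60-second lifetime are assumptions, since real systems choose their own timers.

```python
import time


class ArpCache:
    """IP -> MAC table in which every entry has a finite lifetime."""

    def __init__(self, lifetime=60.0, clock=time.monotonic):
        self.lifetime = lifetime
        self.clock = clock
        self.entries = {}  # ip -> (mac, expiry time)

    def learn(self, ip, mac):
        """Add or refresh an entry, e.g. after receiving an ARP response."""
        self.entries[ip] = (mac, self.clock() + self.lifetime)

    def lookup(self, ip):
        """Return the MAC address, or None if unknown or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:
            del self.entries[ip]  # too old: a new ARP request is needed
            return None
        return mac
```

When `lookup` returns `None`, the sender has to issue an ARP request before it can build the frame.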
How does it work?
! If the destination address is not known (not in the table) → issue an ARP request: Ethernet frame in broadcast
[Figure: exchange between the sender and the destination (IP = α, MAC = β)]
• ARP request (broadcast): Who has IP = α?
• ARP response (point to point): That’s me (MAC address = β)
ARP - duplication detection (1/2)
! When I configure the IP address
! ARP request: broadcast an Ethernet frame
[Figure: exchange between the source and a device with an address collision (IP = α, MAC = β)]
• Configuration of the IP address = α
• ARP request (broadcast): Who has IP = α?
• ARP response (point to point): That’s me (MAC address = β)
→ The address is already in use
ARP - duplication detection (2/2)
! When I configure the IP address
! ARP request: broadcast an Ethernet frame
[Figure: the source broadcasts an ARP request and receives no answer]
• Configuration of the IP address = α
• ARP request (broadcast): Who has IP = α?
• No response after 1 s: OK, I can use the address
IPv6 address configuration
Addressing scheme
! RFC 4291 defines current IPv6 addresses
• loopback (::1)
• link local (fe80::/10)
• global unicast (2000::/3)
• multicast (ff00::/8)
! Use CIDR principles
• Prefix / prefix length notation
• 2001:660:3003::/48
• 2001:660:3003:2:a00:20ff:fe18:964c/64
! Interfaces have several IPv6 addresses
• At least a link local and a global unicast address
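The address types and the prefix notation can be explored with Python's standard `ipaddress` module. The sketch below classifies an address against the prefixes of this slide (with the multicast prefix written in full as ff00::/8); the function name `ipv6_kind` is hypothetical.

```python
import ipaddress

# Prefixes taken from the addressing scheme above.
KINDS = (
    ("link local", ipaddress.ip_network("fe80::/10")),
    ("global unicast", ipaddress.ip_network("2000::/3")),
    ("multicast", ipaddress.ip_network("ff00::/8")),
)


def ipv6_kind(address):
    """Name the address type, or 'other' if no listed prefix matches."""
    addr = ipaddress.IPv6Address(address)
    if addr == ipaddress.IPv6Address("::1"):
        return "loopback"
    for name, prefix in KINDS:
        if addr in prefix:
            return name
    return "other"


# Prefix / prefix length notation: the /64 selects the first 64 bits.
iface = ipaddress.ip_interface("2001:660:3003:2:a00:20ff:fe18:964c/64")
```

Here `iface.network` is the prefix 2001:660:3003:2::/64, which itself lies inside 2001:660:3003::/48.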
An IPv6 address - 128 bits
!Global unicast address
!Link local address
2001:660:3003:1:34CA:3B73:6543:210F /64
Let’s assume you want to configure it yourself
!You need:
• Prefix information
• Interface ID
Interface ID assignment
! Derived from an L2 ID (i.e., MAC address)
! Manually assigned
• To keep the same address when the Ethernet card or host is changed
• To easily remember the address
- 1, 2, 3...
- Last digit of the v4 address
! Random value
• Changes frequently (e.g., every day, per session, at reboot)
• Guarantees anonymity
! Hash or other value
• To link the address to other properties
- public key
- list of assigned prefixes
• Mainly for security purposes
Neighbor Discovery
! IETF protocols
• RFC 4861: Neighbor Discovery for IP version 6 (IPv6)
• RFC 4862: IPv6 Stateless Address Autoconfiguration
• RFC 4135: Goals of Detecting Network Attachment in IPv6
! Mechanisms
• Router discovery
• Prefix discovery
• Address resolution
• Address auto-configuration
Router Advertisement
! Periodic message sent by routers
! Message sent in response to a Solicitation
! ICMP message
[Figure: Router Advertisement]
Possible Options
! Source Link Layer address
• Source link layer address of the router
! MTU
• Indicates the recommended size for the MTU on the link
! Prefix Information
• Indicates the prefix(es) used on the link
Router and prefix discovery
! Identification of the prefix(es) used on the link
• Determine the destinations that are on-link
• Prefix used for address auto-configuration
! Identification of the on-link router(s) and default router
! The information from a router is only a part of the information of the link
• The reception of an RA must not erase the previous configuration
Receipt of an RA
! The node adds the source address of the RA to its router list
! If the Source Link Layer option is set, the node registers the association between the link local address and the link layer address in the Neighbor Cache
! Prefix option
• If the OnLink flag is set, it means that the prefix is used on the link
• Add this prefix to the list of prefixes
Next hop determination
! Usage of a set of tables
• Destination cache
• Prefix list
• Default router list
• Neighbor cache
! Algorithm
• Look for the destination address in the Destination Cache
• Compare the destination IP address with prefixes => longest prefix match
- If the destination is on-link, the destination address remains unchanged
- If the destination is not on-link, the node selects a router as a next hop
• Determination of the link layer address
• Memorization of the choice in the Destination Cache
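The algorithm above can be sketched with plain Python containers standing in for the tables. This is a simplified illustration: the function name `next_hop` and the data layout are hypothetical, the first default router is always chosen, any matching prefix is treated as on-link rather than ranking matches by length, and link layer address resolution is left out.

```python
import ipaddress


def next_hop(destination, destination_cache, prefix_list, default_routers):
    """Return the IP address of the next hop for a destination."""
    # 1. Look for the destination in the Destination Cache.
    if destination in destination_cache:
        return destination_cache[destination]
    # 2. Compare the destination with the on-link prefixes.
    addr = ipaddress.ip_address(destination)
    on_link = any(addr in ipaddress.ip_network(prefix) for prefix in prefix_list)
    if on_link:
        hop = destination            # on-link: deliver directly
    elif default_routers:
        hop = default_routers[0]     # off-link: go through a router
    else:
        raise LookupError("no default router known")
    # 3. Memorize the choice in the Destination Cache.
    destination_cache[destination] = hop
    return hop
```

A node with the prefix 2001:660:7301:d170::/64 in its prefix list sends directly to neighbors in that prefix and hands everything else to its default router.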
Link layer address determination
!Use of the following messages
• Neighbor Advertisement
• Neighbor Solicitation
! Assumption: the node knows the destination IP address and looks for the link layer address
• Send a Neighbor Solicitation
• Receive a Neighbor Advertisement from the target destination
• Record the correspondence between the IP address and the link layer address in the Neighbor Cache
Auto-configuration
! Goal: create a valid address for a given link, without human intervention
! Several steps
• Create a link local address by concatenating the link local prefix and an interface identifier derived from the link layer address of the interface
- Prefix: fe80::/64
- Link layer address: 00:16:cb:b9:50:bd
- EUI-64 identifier of the interface: 02:16:cb:ff:fe:b9:50:bd
- Generated address: fe80::216:cbff:feb9:50bd
• Check the address uniqueness - Duplicate Address Detection
- Send a Neighbor Solicitation to the destination address that has been created
• Reception of the prefix sent by the router (Router Advertisement)
• Creation of the global IP address by concatenating the prefix with the interface identifier derived from the link layer address
- Prefix: 2001:660:7301:d170::/64
- Link layer address: 00:16:cb:b9:50:bd
- Global address: 2001:660:7301:d170:216:cbff:feb9:50bd
• Optional check of the uniqueness of the global IP address
Link layer address - EUI-64 Conversion
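The conversion can be sketched as follows, assuming the usual modified EUI-64 rule: insert ff:fe in the middle of the 48-bit MAC address and flip the U/L bit of the first byte. The function names are hypothetical; the MAC address and prefixes are the ones used in the auto-configuration example.

```python
import ipaddress


def mac_to_interface_id(mac):
    """Build the 64-bit interface identifier from a 48-bit MAC address."""
    raw = bytearray(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError("a MAC address has 6 bytes")
    raw[0] ^= 0x02  # flip the universal/local bit
    return bytes(raw[:3]) + b"\xff\xfe" + bytes(raw[3:])


def build_address(prefix, mac):
    """Concatenate a /64 prefix with the interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    interface_id = int.from_bytes(mac_to_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


link_local = build_address("fe80::/64", "00:16:cb:b9:50:bd")
global_address = build_address("2001:660:7301:d170::/64", "00:16:cb:b9:50:bd")
```

Both results match the generated addresses shown in the auto-configuration steps.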
Neighbor Unreachability Detection
! Check whether a device is reachable
• In case of mobility, check that the previous router is still on-link
! Send a Neighbor Solicitation to the target IP address
Stateless auto-configuration
[Figure: a router (FE80::IID1, α::IID1/64) and a host (FE80::IID2, α::IID2/64) on the same link, with a timeline from t=0 to t=6]
t=0 - Router is configured: the router is configured with a link-local address and manually configured with a global address (the prefix α::/64 is given by the network manager)
t=1 - Link local address configuration: the host attaches to the link and builds its link local address based on the interface MAC address
t=2 - DAD on link local address: the host does a DAD, i.e., sends a Neighbor Solicitation (FE80::IID2) to query resolution of its own address; no answer after a timeout means no other host has this value => ok
t=3 - Request for a Router Advertisement: the host sends a Router Solicitation to the All Routers multicast group using the newly configured link-local address
t=4 - Receives a Router Advertisement: the router answers directly to the host using link-local addresses (Router Advertisement (α)). The answer may contain one or several prefixes. The router can also mandate hosts to use DHCPv6 to obtain prefixes (stateful auto-configuration) and/or other parameters (DNS servers, ...)
t=5 - DAD on global address: the host performs a DAD, i.e., sends a Neighbor Solicitation (α::IID2) to query resolution of its own global address; no answer after a timeout means no other host has this value => ok
t=6 - The host sets the global address and configures the answering router as the default router
In summary... Neighbor Discovery
! Determine the link layer address of neighbors
! Address auto-configuration (stateful and stateless)
• Layer 3 parameters (IPv6 address, default route, MTU and hop limit)