Multicast technology: what it is, what has been done, what's next?
C. Pham, Univ. Lyon 1, INRIA RESO/LIP
Some parts of this talk borrow material from a tutorial presented at ICT'2003 (C. Pham & V. Roca)
Tuesday, June 24th, LIP, ENS Lyon
Jan 30, 2016
How can multicast change the way people use the Internet?
Everybody's talking about multicast! Really annoying! What is it exactly?
Purpose of this tutorial
  Provide a comprehensive overview of current basic multicast technologies
  Show what the main problems are and how they can be solved
  Show future directions and hot spots in multicasting
This tutorial will…
  explain how multicast can change the way people use the Internet
  present the main technologies behind multicast, at both the routing and transport levels
  report on the current deployment of multicast technologies and the problems encountered in large-scale deployment
From unicast…
  Sending the same data to many receivers via unicast is inefficient
  Popular WWW sites become serious bottlenecks
[Figure: a sender and three receivers; one copy of the data is sent per receiver]
…to multicast on the Internet.
  Not n-unicast from the sender's perspective
  Efficient one-to-many data distribution
  Towards low latency, high bandwidth
  Routers at branching points perform packet duplication
[Figure: a sender and three receivers; the data is duplicated only at branching points]
New applications for the Internet
Think about…
  high-speed WWW
  video-conferencing
  video-on-demand
  interactive TV programs
  remote archival systems
  tele-medicine, white board
  high-performance computing, grids
  virtual reality, immersion systems
  distributed interactive simulations/gaming…
A whole new world for multicast…
The delivery models (1)
model 1: streaming (e.g. for audio/video)
  multimedia data requires efficiency due to its size
  requires real-time, semi-reliable delivery
  asynchronous
The delivery models (2)
model 2: push delivery
  synchronous model where delivery starts at t0
  usually requires fully reliable delivery, limited number of receivers
  Ex: synchronous software updates
[Figure: timeline; receivers become ready, transmission starts at t0, each receiver leaves once its transfer is ok]
The delivery models (3)
model 3: on-demand delivery
  popular content (video clip, software, update, etc.) is continuously distributed in multicast
  users arrive at any time, download, and leave
  possibility of millions of users, no real-time constraint
[Figure: timeline; receivers become ready at different times and each leaves once its download is ok]
A very simple example in figures
File replication (PUSH) with ftp
  10 MBytes file
  1 source, n receivers (replication sites)
  512 Kbits/s upstream access
  n=100: Tx = 4.55 hours
  n=1000: Tx = 1 day 21 hours 30 mins!
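The figures above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes that "10 MBytes" means 10 × 2^20 bytes and "512 Kbits/s" means 512,000 bit/s (the reading that matches the slide's numbers), and that the source sends one full copy after the other over its upstream link. The function name is illustrative.

```python
def sequential_unicast_time(file_bytes: int, rate_bps: int, receivers: int) -> float:
    """Seconds needed to push one full copy per receiver over one upstream link."""
    return receivers * file_bytes * 8 / rate_bps

FILE_BYTES = 10 * 2**20   # assumption: 10 MBytes = 10 * 2^20 bytes
RATE_BPS = 512_000        # assumption: 512 Kbits/s = 512,000 bit/s

for n in (100, 1000):
    seconds = sequential_unicast_time(FILE_BYTES, RATE_BPS, n)
    print(f"n={n}: {seconds / 3600:.2f} hours")

# With multicast the source sends the file once, whatever n is.
multicast_seconds = sequential_unicast_time(FILE_BYTES, RATE_BPS, 1)
print(f"multicast: {multicast_seconds:.2f} seconds")
```

Under these assumptions, n=100 gives 16,384 seconds (about 4.55 hours) and n=1000 about 45.5 hours, while a single multicast transmission takes under three minutes.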
A real example: LHC (DataGrid)
[Figure (source: DataGrid): tiered LHC computing model]
  Online System: one bunch crossing per 25 nsecs, 100 triggers per second, each event is ~1 MByte in size
  Offline Farm: ~20 TIPS
  Tier 0: CERN Computer Center, > ~20 TIPS
  Tier 1: Fermilab, France Regional Center, Italy Regional Center, UK Regional Center; ~4 TIPS
  Tier 2: Tier2 Centers
  Tier 3: Institutes (~0.25 TIPS) with a physics data cache
  Workstations
  Link rates shown in the figure: ~PBytes/sec, ~100 MBytes/sec, ~2.4 Gbits/sec, ~622 Mbits/sec (or air freight), 100 - 1000 Mbits/sec
  Physicists work on analysis “channels”. Each institute has ~10 physicists working on one or more channels. Data for these channels should be cached by the institute server.
  1 TIPS = 25,000 SpecInt95; PC (1999) = ~15 SpecInt95
Reliable multicast: a big win for grids
  Data replications
  Code & data transfers, interactive job submissions
  Data communications for distributed applications (collective & gather operations, sync. barrier)
  Databases, directory services
[Figure: multicast address group 224.2.0.1 spanning SDSC IBM SP (1024 procs, 5x12x17 = 1020), NCSA Origin Array (256+128+128, 5x12x(4+2+2) = 480) and CPlant cluster (256 nodes)]
Wide-area interactive simulations
[Figure: human-in-the-loop flight simulator, battlefield simulation, display and computer-based submarine simulator exchanging (x,y,z) data over the INTERNET]
The challenges of multicast
SCALABILITY - SECURITY - TCP Friendliness - MANAGEMENT
Part I
Getting started
Multicast BONE at the ENS Lyon
SDR (Session DiscoveRy)
Multicast on E-Toile (RNTL)
Demo June 5th, 2003 showing multicast on computational grids
[Figure: a source multicasting over VTHD; sites labelled ENS, CERN, CEA, ROCQ]
Demo was successful!
[Figure: panels labelled source, CERN and ENS]
Part I
  Basics of the IP multicast model
  IP multicast routing
A look back at the history of multicast
  Long history of usage on shared-medium networks
  Resource discovery: ARP, Bootp
Timeline
  1973: Ethernet, radio network
  1983: ARP (RFC 826)
  1985: Bootp (RFC 951)
  1986: Deering's work, IP multicast (RFC 966, 988, 1054, 1112)
The Internet group model
  multicast/group communications means 1 → n as well as n → m
  a group is identified by a class D IP address (224.0.0.0 to 239.255.255.255)
  abstract notion that does not identify any host!
[Figure: from logical view (multicast group 225.1.2.3; host_1, source, 194.199.25.100; host_2, receiver, 194.199.25.101; host_3, receiver, 133.121.11.22)... to physical view (site 1 and site 2, Ethernets with multicast routers, linked across the Internet by the multicast distribution tree)]
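As a small illustration of the class D rule, the check below tests whether an IPv4 address falls in 224.0.0.0 to 239.255.255.255, i.e. whether its first four bits are 1110. The function name is illustrative; Python's standard `ipaddress` module offers the same test through `is_multicast`.

```python
import ipaddress

def is_class_d(addr: str) -> bool:
    """True if addr lies in 224.0.0.0 - 239.255.255.255 (leading bits 1110)."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    return (first_octet >> 4) == 0b1110

print(is_class_d("225.1.2.3"))        # the group used in the figure
print(is_class_d("194.199.25.100"))   # a host address, not a group
```

The first call returns True and the second False: a group address names a set of receivers, never a single host.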
The group model is an open model
  anybody can belong to a multicast group: no authorization is required
  a host can belong to many different groups: no restriction
  a source can send to a group, no matter whether it belongs to the group or not: membership not required
  the group is dynamic, a host can subscribe to or leave it at any time
  a host (source/receiver) does not know the number/identity of members of the group
Example: video-conferencing
The user's perspective
[Figure from UREC, http://www.urec.fr: multicast address group 224.2.0.1]
What's behind the scenes?
[Figure: domains, peering points, Internet routers and access routers carrying group 224.2.0.1]
IP multicast TODO list
  Receivers must be able to subscribe to groups: group management facilities are needed
  A communication tree must be built from the source to the receivers
  Branching points in the tree must keep multicast state information
  Inter-domain routing must be reconsidered for multicast traffic
  Non-multicast clouds must be taken into account
good luck…
[Figure: open issues: incremental deployment, groups management, session advertising, tree construction, address allocation, duplication engine, forwarding state, routing between multicast islands and unicast islands, TCP?]
Multicast and the TCP/IP layered model
[Figure: protocol stack split into user space and kernel space. Labels: Application; higher-level services (congestion control, reliability mgmt, security, other building blocks); multicast routing; Socket layer; TCP, UDP; ICMP, IGMP; IP / IP multicast; device drivers]
The two sides of IP multicast
local-area multicast
  uses the potential diffusion capabilities of the physical layer (e.g. Ethernet)
  efficient and straightforward
wide-area multicast
  requires going through multicast routers, using IGMP/multicast routing/... (e.g. DVMRP, PIM-DM, PIM-SM, PIM-SSM, MSDP, MBGP, BGMP, MOSPF, etc.)
  routing within the same administrative domain is simple and efficient
  inter-domain routing is complex, not fully operational
IP Multicast Architecture
  Service model
  Hosts
  Host-to-router protocol
  Routers
  Multicast routing protocols
Internet Group Management Protocol (RFC 1112)
  IGMP: “signaling” protocol to establish, maintain, remove groups on a subnet
  Objective: keep the router up-to-date with the group membership of the entire LAN
  Routers need not know who all the members are, only that members exist
  Each host keeps track of which mcast groups it is subscribed to
  The socket API informs the IGMP process of all joins
[Figure: Hosts, Routers]
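To make the "socket API informs IGMP" step concrete, here is a minimal sketch using the standard BSD socket options as exposed by Python. The helper names are illustrative, and the interface address 0.0.0.0 (let the kernel choose the interface) is an assumption of this sketch. Calling `join_group` on a UDP socket is what makes the host's IGMP code report the membership on the LAN.

```python
import socket

def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)

def join_group(sock: socket.socket, group: str) -> None:
    """Subscribe the socket to `group`; the kernel then handles IGMP signaling."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))

def leave_group(sock: socket.socket, group: str) -> None:
    """Drop the subscription made by join_group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, build_mreq(group))
```

A receiver would typically create a `socket.SOCK_DGRAM` socket, bind it to the session's UDP port, call `join_group(sock, "224.2.0.1")`, and then read datagrams with `recvfrom`.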
IGMP: subscribe to a group (1)
  224.0.0.1 reaches all multicast hosts on the subnet
  the router periodically sends an IGMP Query to 224.0.0.1
IGMP: subscribe to a group (2)
  a host sends a Report for 224.2.0.1
  another host notes that somebody has already subscribed to the group
IGMP: subscribe to a group (3)
  a host sends a Report for 224.5.5.5
Data distribution example
  data for 224.2.0.1 is delivered on the subnet: OK
IGMP Join
[Figures: Host 1, Host 2, Host 3 and the router, each labelled with the groups it currently knows (224.2.0.1, 224.5.5.5, or empty)]
IGMP: leave a group (1)
  224.0.0.2 reaches the multicast-enabled routers on the subnet
  a host sends a Leave for 224.2.0.1 to 224.0.0.2
IGMP: leave a group (2)
  the router sends an IGMP Query for 224.2.0.1
IGMP: leave a group (3)
  another host sends a Report for 224.2.0.1: "Hey, I'm still here!"
IGMP: leave a group (4)
  a host sends a Leave for 224.5.5.5 to 224.0.0.2
IGMP: leave a group (5)
  the router sends an IGMP Query for 224.5.5.5
IGMP Leave
[Figures: Host 1, Host 2, Host 3 and the router, each labelled with the groups it currently knows (224.2.0.1, 224.5.5.5)]
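The join and leave exchanges can be summarised as a tiny router-side state machine. This is a simplified sketch, not the IGMP specification: timers are reduced to an explicit `on_query_timeout` call, and the class and method names are invented for illustration.

```python
class SubnetMembership:
    """Groups for which the router believes at least one member exists on the LAN."""

    def __init__(self):
        self.groups = set()          # groups with at least one known member
        self.pending_leave = set()   # groups queried after a Leave, awaiting a Report

    def on_report(self, group):
        self.groups.add(group)
        self.pending_leave.discard(group)   # somebody is still here

    def on_leave(self, group):
        """A Leave triggers a group-specific Query; return the group to query, if any."""
        if group in self.groups:
            self.pending_leave.add(group)
            return group
        return None

    def on_query_timeout(self, group):
        """No Report answered the Query: stop forwarding this group onto the LAN."""
        if group in self.pending_leave:
            self.pending_leave.discard(group)
            self.groups.discard(group)

lan = SubnetMembership()
lan.on_report("224.2.0.1")
lan.on_report("224.5.5.5")

lan.on_leave("224.2.0.1")          # router queries the group...
lan.on_report("224.2.0.1")         # ...and another host answers "still here"
lan.on_query_timeout("224.2.0.1")  # nothing to remove

lan.on_leave("224.5.5.5")          # router queries the group...
lan.on_query_timeout("224.5.5.5")  # ...nobody answers, the group is dropped
print(lan.groups)
```

At the end only 224.2.0.1 remains, which mirrors the slides: the router never tracks who the members are, only whether at least one is left.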
OK, now I can express local interest, so what?
[Figure: Host 1, Host 2, Host 3 with groups 224.2.0.1 and 224.5.5.5, and a "?" beyond the subnet]
Do all paths lead to Rome?
[Figure: source]
Before going further… Multicast on Ethernet LAN
  How can an end-host get link-layer (MAC) packets?
  Review of Ethernet filtering
    By default, the Ethernet device listens on
      its (Ethernet) MAC address, fixed in a PROM
      the broadcast MAC address FF:FF:FF:FF:FF:FF
    Other Ethernet addresses must be explicitly programmed into the driver
  For multicast, one must listen on:
    the Ethernet equivalent of 224.0.0.1 (all multicast hosts in the LAN)
    the Ethernet-equivalent address on which multicast sessions are advertised
Mapping of IP multicast address
  A MAC address is built from a mapping of the IP multicast addr (Deering88)
  IP multicast address (224.x.y.z): leading bits 1110, then 28 bits of group address
  LAN multicast address: 25-bit prefix 0000 0001 0000 0000 0101 1110 0 (includes the Ethernet group bit), then the low-order 23 bits of the IP multicast address
  28 bits mapped onto 23 bits: 32:1 ratio
  Organizationally Unique Identifier (OUI, see RFC 1700 Assigned Numbers)
  Special OUI for IETF: 0x01-00-5E
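The mapping can be sketched in a few lines: keep the low-order 23 bits of the group address and place them behind the 01-00-5E prefix. The function name is illustrative. Because 5 of the 28 group bits are dropped, 32 different groups share each MAC address, which is the 32:1 ratio mentioned above.

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a class D address")
    low23 = int(addr) & 0x7FFFFF          # low-order 23 bits of the group
    mac = (0x01005E << 24) | low23        # 01-00-5E prefix, 25th bit left at 0
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(multicast_mac("224.2.0.1"))
print(multicast_mac("225.130.0.1"))   # a different group, same MAC address
```

Both calls yield 01:00:5e:02:00:01, so a host listening to one of these groups may also receive frames for the other and has to filter them at the IP layer.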
Part I
  Basics of the IP multicast model
  IP multicast routing
IP multicast routing
  Find a tree (dedicated, shared) between the source(s) and the receivers
  Dense Mode
    Assumes that there are many receivers willing to get multicast traffic
  Sparse Mode
    Assumes that the number of receivers is small
    Requires an explicit request from the receivers
Dense mode protocols, DVMRP
  The Ancestor: DVMRP (Distance Vector Multicast Routing Protocol)
  Based on Reverse Path Forwarding (RPF)
    A multicast router forwards a packet only if it was received on the link that is on the shortest path to the source, and drops it otherwise
[Figure: physical topology with a source, a receiver and routers R1 to R5; copies arriving on other links are dropped]
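The RPF test itself is a one-line comparison against the unicast routing table. The sketch below assumes a table that maps each source to the link this router would use to reach it; the table layout, addresses and link names are invented for illustration.

```python
def rpf_accept(unicast_next_link: dict, source: str, arrival_link: str) -> bool:
    """Accept a multicast packet only if it arrived on the link this router
    would itself use to reach the source (the reverse shortest path)."""
    return unicast_next_link.get(source) == arrival_link

# Hypothetical routing state of one router: the source is reached via link "to_R1".
routes = {"10.0.0.1": "to_R1"}

print(rpf_accept(routes, "10.0.0.1", "to_R1"))  # on the shortest path: forward
print(rpf_accept(routes, "10.0.0.1", "to_R4"))  # duplicate via another link: drop
```

The check needs no multicast-specific topology information, which is why early dense-mode protocols could reuse distance-vector routing tables.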
DVMRP... (cont’)
  resulting multicast distribution tree
  different sources lead to different trees: improves load distribution on the links
  creates a spanning tree…
[Figure: distribution trees rooted at two different sources]
DVMRP... (cont’)
  add a “flood and prune” algorithm to dynamically update the tree
  step 1: flood the Internet (only limited by the packet’s TTL)
  step 2: prune useless branches (PRUNE messages: "Stop, no receiver here!")
[Figure: source and receiver; the flooded tree, then the same tree with “pruned” branches]
DVMRP... (cont’)
  flooding/pruning is done periodically to update the tree
    required to discover new receivers and remove branches to receivers who left the session
  limitations:
    creates signaling load (PRUNE messages)
    periodically creates significant traffic (flooding)
    all routers keep some state for all the multicast groups in use in the Internet
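The pruning step can be illustrated on a toy tree. This is a sketch under simplifying assumptions: the flooded tree is given as a parent-to-children dictionary, receivers are identified with the routers they attach to, and all names are invented. A router stays on the tree if it has a local receiver or a downstream router that stays; otherwise it would send a PRUNE upstream.

```python
def prune(children: dict, node: str, receivers: set) -> set:
    """Return the nodes that remain on the tree at and below `node` after pruning."""
    kept = set()
    for child in children.get(node, []):
        kept |= prune(children, child, receivers)
    if kept or node in receivers:
        kept.add(node)
    return kept

# Flooded tree: source -> R1 -> (R2 -> R4, R3 -> R5); only R4 has a receiver.
flooded = {"source": ["R1"], "R1": ["R2", "R3"], "R2": ["R4"], "R3": ["R5"]}
on_tree = prune(flooded, "source", receivers={"R4"})
print(sorted(on_tree))
```

Here R3 and R5 are pruned. Note that every router still had to see the flooded traffic and keep prune state, which is the scalability limitation listed above.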
DVMRP deployment
  large-scale deployment of DVMRP in the MBONE (multicast backbone) since 1992
  tunnels are set up to link “multicast islands” through unicast areas
[Figure: a source and a receiver in multicast islands separated by unicast-only routers; multicast router R1 performs encapsulation (dst = unicast @R2) and multicast router R2 performs decapsulation]
Multicast tunnelling illustrated
[Figure: a tunnel for multicast between IP multicast router a and IP multicast router b across non-IP-multicast routers; the packet x|224.4.4.9 sent by host IP x to group 224.4.4.9 crosses the tunnel encapsulated as IP a | IP b | x|224.4.4.9]
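The encapsulation shown in the figure can be modelled with a nested packet structure. The `Packet` class and both helpers are illustrative stand-ins, not a real IP-in-IP implementation: the entry router wraps the multicast packet in a unicast packet addressed to the tunnel exit, and the exit router unwraps it and resumes native multicast forwarding.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: Union[bytes, "Packet"]

def encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap a multicast packet in a unicast packet between the tunnel endpoints."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)

def decapsulate(outer: Packet) -> Packet:
    """Recover the original multicast packet at the tunnel exit."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunnelled packet")
    return outer.payload

original = Packet(src="x", dst="224.4.4.9", payload=b"data")
in_tunnel = encapsulate(original, tunnel_src="a", tunnel_dst="b")
print(in_tunnel.dst)                       # unicast routers only see destination b
print(decapsulate(in_tunnel) == original)
```

The routers in between never look at the group address, which is how the early MBone crossed networks that had no multicast support.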
The early MBone with tunnels
[Figure; source: K. Almeroth's paper, IEEE Network Magazine, Vol. 14(1)]
Mixing tunnels and native multicast
[Figure; source: K. Almeroth's paper, IEEE Network Magazine, Vol. 14(1)]
DVMRP on Linux: the mrouted daemon
DVMRP summary
  it works but... this is far from perfect
    periodic flooding creates a heavy load on routers/links
    each multicast router must keep some forwarding state for each group
    tunneling quickly became anarchic
    this is a flat architecture (the same protocol is used everywhere)
  conclusion: “dense mode protocols” like DVMRP are not scalable enough for WAN multicast routing
    dense mode assumes a dense distribution of receivers, which is wrong in practice!
DVMRP uses Source-based Trees
[Figure (source: Shivkumar Kalyanaraman): routers, sources (S) and receivers (R); one tree per source]
Moving to a Shared Tree
[Figure (source: Shivkumar Kalyanaraman): routers, sources (S) and receivers (R); a single tree through the RP]
Shared vs. Source-Based Trees
Source-based trees
  Shortest path trees: low delay, better load distribution
  More state at routers (per-source state)
  Efficient in dense-area multicast
Shared trees
  Higher delay (bounded by a factor of 2), traffic concentration
  Choice of core affects efficiency
  Per-group state at routers
  Efficient for sparse-area multicast
Source: Shivkumar Kalyanaraman
Sparse mode protocols
The newcomers: PIM-SM/MSDP/MBGP
  PIM-SM: Protocol Independent Multicast - Sparse Mode
  MSDP: Multicast Source Discovery Protocol
  MBGP: Multi-protocol Border Gateway Protocol
domain: a site or an ISP network, similar to the “autonomous systems” of unicast routing
  intra-domain mcast routing uses PIM-SM
  inter-domain mcast routing requires MBGP
  the discovery of sources in other domains requires MSDP
PIM-SM Protocol Overview
Basic protocol steps
  Routers with local members Join toward the Rendezvous Point (RP) to join the shared tree
  Routers with local sources encapsulate data in Register messages to the RP
  Routers with local members may initiate a data-driven switch to source-specific shortest path trees
PIM v.2 Specification (RFC 2362)
Source: Shivkumar Kalyanaraman
PIM-SM: Build Shared Tree
[Figure (source: Shivkumar Kalyanaraman): Source 1, RP, Receivers 1 to 3; Join messages toward the RP install (*,G) state; shared tree after R1, R2 join]
Data Encapsulated in Register
[Figure (source: Shivkumar Kalyanaraman): a unicast-encapsulated data packet is sent to the RP in a Register message; the RP decapsulates it and forwards it down the shared tree, along (*,G) state]
RP Sends Join to High Rate Source
[Figure (source: Shivkumar Kalyanaraman): a Join message toward S1 installs (S1,G) state next to the shared tree]
Build Source-Specific Distribution Tree
[Figure (source: Shivkumar Kalyanaraman): Join messages build a source-specific tree for the high data rate source; routers hold (S1,G) and (*,G) state next to the shared tree]
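The coexistence of (*,G) and (S,G) state can be pictured as a lookup where the more specific entry wins. This is a deliberately simplified sketch: real PIM-SM state also carries incoming-interface checks and RPT flags, and the table layout, addresses and interface names here are invented.

```python
def outgoing_interfaces(mroutes: dict, source: str, group: str) -> list:
    """Source-specific (S,G) state takes precedence over shared-tree (*,G) state."""
    if (source, group) in mroutes:
        return mroutes[(source, group)]
    return mroutes.get(("*", group), [])

# Hypothetical state on a router that switched to the shortest path tree for S1 only.
mroutes = {
    ("*", "224.2.0.1"): ["if_receivers"],                 # shared tree via the RP
    ("10.1.1.1", "224.2.0.1"): ["if_receivers", "if_b"],  # (S1,G) shortest path tree
}

print(outgoing_interfaces(mroutes, "10.1.1.1", "224.2.0.1"))  # uses (S1,G)
print(outgoing_interfaces(mroutes, "10.2.2.2", "224.2.0.1"))  # falls back to (*,G)
```

The table also shows the cost discussed in the deck: each source that moves to its own tree adds one more entry per router on that tree.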
PIM-SM... (cont’)
  moving to a per-source tree is efficient for bulk data transfer, but has a higher cost in case of multiple sources: one tree per source versus a single shared tree
[Figure: from shared tree (sources and receiver connected through the RP)... to per-source tree (one tree from each source to the receiver)]
PIM-SM on Internet routers
  PIM-SM is implemented on all major Internet routers (CISCO, JUNIPER, Alcatel, AVICI, PROCKET…)
  A Linux package exists, see http://netweb.usc.edu/pim/ (I haven’t tried it yet)
Example: PIM-SM on VTHD
[Figure (source: VTHD document, France Telecom R&D, "VTHD: Multicast in VPNs", 23/09/02): VTHD backbone, AS 20603, with R=193.252.113 and R’=193.252.226; GEth, Eth and PoS links; RP markers; PE routers connecting the FT R&D sites of Lannion, Issy, Grenoble, Rennes, Caen and Sophia]
  Router models shown: Cisco GSR, Cisco 7200, Cisco 7500, CE 7500, Juniper M20, Juniper M40, Juniper T640, Avici TSR
  Backbone loopbacks (loop0): ncaub001 R.1/32, ncstl001 R.2/32, ncmso001 R.3/32, ncgre001 R.4/32, ncsop001 R.5/32, ncrou001 R.6/32, ncren001 R.7/32, nclyo001 R.142/32
  FT R&D loopbacks (loop0): Naren001 R’.97, cecae001 R’.98/32, naiss001 R’.99/32, Nagre001 R’.100/32, nasop003 R’.101, Nalan001 R’.104/32, Nacae001 R’.106/32
  Other addresses shown: R.82, R.83, R.93, R.110/30, R.245/30, R’.245/30, R’.246/30
Configuration on CISCO routers
Enabling PIM (for each interface):
  ip multicast-routing distributed
  !
  interface XX/XX
   ip pim sparse-dense-mode
  !
Declaring the RP (w.x.y.z is the IP addr of the RP):
  ip pim rp-address w.x.y.z
Ok, now I have a tree, so what?
[Figure: sender, RP and receivers, with a "?"]
MBGP for inter-domain connectivity
  MBGP (MultiProtocol BGP, RFC 2283) is an extension to BGP4 that carries more than IPv4 route prefixes (MP_REACH_NLRI)
  Maintains a separate M(ulticast)-RIB
  The internal topology of a domain is only known to the local MBGP router
  Each MBGP router only knows how to reach other multicast domains
[Figure: domains 1, 2 and 3, each with a BGP router and an MBGP router; the inter-domain topology is created by running MBGP]
BGP background (1)
[Figure from CISCO]
BGP background (2)
[Figure from CISCO]
BGP background (3)
[Figure from CISCO]
Multiprotocol BGP
[Figure from CISCO]
Ok, now I have inter-domain routing, so what?
  Where are the sources? How can we discover them?
[Figure: domains A, B, C and D, each with its own RP, and a source]
MSDP for inter-domain src discov.
  each domain runs PIM-SM with its own local RP to avoid third-party dependency
  problem: how can a receiver in one domain be informed of a source located in another domain? With MSDP!
[Figure: domains 1, 2 and 3 with RP1, RP2, a source, receivers and MSDP peers; when a new source is detected, a source active (SA) message is sent between MSDP peers]
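How MSDP works with PIM-SM
[Figure (source: Shivkumar Kalyanaraman): domains A, B, C and D, each with an RP acting as MSDP peer; SA messages (MSDP) travel between the RPs and Join messages (PIM) travel between the receiver and the source; legend: physical link, PIM message, MSDP message]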
Example: MBGP/MSDP on VTHD
  The RP’s address is announced with MBGP
  External active sources are discovered with MSDP
[Figure (source: VTHD document): MSDP/MBGP configuration; VTHD (AS 20603) with the Rennes RP, an iBGP session, a border router, and an e-MSDP + eMBGP session and an eBGP session toward the RP of an external AS]
MSDP… (cont’)
problem with some applications
  reducing the join latency requires each peer to keep a cache of active sources
  the cache follows a soft-state model, where entries must be periodically refreshed
  does not work with low-frequency bursty applications: the soft state has expired each time a packet is sent… receivers never get any packet
limited scalability in terms of number of groups
  each peer informs every other peer of local sources, and everybody knows everything!
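The soft-state problem can be seen with a toy cache. This is a sketch only: the class, its 60-second lifetime and the 300-second burst period are illustrative values chosen for the example, not MSDP defaults.

```python
class SACache:
    """Soft-state cache of active (source, group) pairs, as kept by a peer."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime   # seconds an entry survives without refresh
        self.expires = {}          # (source, group) -> expiry time

    def refresh(self, source: str, group: str, now: float) -> None:
        """Called whenever the source is seen sending to the group."""
        self.expires[(source, group)] = now + self.lifetime

    def is_active(self, source: str, group: str, now: float) -> bool:
        return self.expires.get((source, group), float("-inf")) > now

cache = SACache(lifetime=60.0)               # assumed lifetime
cache.refresh("10.1.1.1", "224.2.0.1", now=0.0)

print(cache.is_active("10.1.1.1", "224.2.0.1", now=30.0))    # steady sender: known
print(cache.is_active("10.1.1.1", "224.2.0.1", now=299.0))   # bursty sender: forgotten
```

A source that sends one short burst every 300 seconds is absent from the cache almost all the time, so the state needed to deliver its next burst is typically missing when that burst arrives.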
Conclusions PIM-SM/MBGP/MSDP
works, currently operational
  deployed in VTHD (http://www.vthd.org)
  deployed in the GEANT European network: http://www.dante.net/nep/GEANT-MULTICAST/
but this is not the long term solution...
  high signaling load for dynamic groups
  problems with low-frequency bursty applications
  limited scalability with the number of groups
the long term solution may be quite different...
Single-Source Multicast (SSM)
Source-specific channel (S,G)
  only S can send to G
  another source S’ must use a separate channel (S’,G)
  hosts join channels, so a member joining only (S,G) will NOT receive traffic from S’
Current infrastructure uses Any-Source Multicast (ASM)
  any source can send to any group at any time
[Figure: channels (S,G) and (S’,G)]
Source: Shivkumar Kalyanaraman
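The difference between the two service models comes down to what a subscription is keyed on. The sketch below contrasts them with two illustrative predicates; the function names and addresses are invented for the example.

```python
def deliver_asm(joined_groups: set, source: str, group: str) -> bool:
    """ASM: any source may send to a group the host has joined."""
    return group in joined_groups

def deliver_ssm(joined_channels: set, source: str, group: str) -> bool:
    """SSM: traffic is delivered only for an explicitly joined channel (S,G)."""
    return (source, group) in joined_channels

S, S_PRIME, G = "10.1.1.1", "10.6.6.6", "232.1.1.1"   # hypothetical addresses

print(deliver_asm({G}, S_PRIME, G))        # ASM: the unexpected source gets through
print(deliver_ssm({(S, G)}, S_PRIME, G))   # SSM: (S',G) was never joined
print(deliver_ssm({(S, G)}, S, G))         # SSM: the chosen source is delivered
```

Because the source is part of the key, two content providers can use the same group address G without interfering with each other.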
Why SSM?
Network Operator
  trivial address allocation (16 million addresses per host)
  no network-layer source discovery (PIM RP and/or MSDP moved to the application layer)
  overcomes two significant obstacles to deployment
Content Provider
  exclusive access to multicast groups (no interruptions)
  permanent multicast groups (easy to advertise)
  provides better service
Source: Shivkumar Kalyanaraman
How SSM Works
[Figure (source: Shivkumar Kalyanaraman): domains A, B, C and D; PIM Join messages travel hop by hop between the receiver and the source; legend: physical link, PIM message]
SSM Advantages (cont’d)
  No RP, no need for MSDP
  All joins are (S,G), so no need for Class D address allocation
  More security
  Receivers find out about sources through out-of-band means (such as a web site)
  SSM-only implementations are much simpler than the full PIM-SM
    No RP, no Bootstrap RP Election
    No Register state machine
    No need to keep (*,G), (S,G,rpt) and (*,*,RP) state
    No (*,G) Assert State
Source: Shivkumar Kalyanaraman
Source specific multicast... (cont’)
  works with limited modifications of current protocols
    use IGMPv3 in hosts and 1st hop routers
    use a modified version of PIM-SM (no RP, go directly to the per-source tree)
  probably the future of IP Multicast routing…
  unless the importance of many-to-many applications overwhelms SSM?
Part II
  Introducing reliability
  End-to-end solutions
  FEC-based solutions
  Layered solutions
  Router-assisted solutions
The Wild Wild Web
[Figure: UDP data crossing the network: heterogeneity, link failures, congested routers, packet loss, packet drop, bit errors…]
Reliability Models
Reliability requires redundancy to recover from uncertain loss or other failure modes.
Two types of redundancy:
  Spatial redundancy: independent backup copies
    Forward error correction (FEC) codes
    Problem: requires huge overhead, since the FEC is also part of the packet(s); it cannot recover from the erasure of all packets
  Temporal redundancy: retransmit if packets are lost or in error
    Lazy: trades off response time for reliability
    The design of status reports and retransmission optimization is important
Temporal Redundancy Model
  Packets: sequence numbers, CRC or checksum
  Status reports: ACKs, NAKs, SACKs, bitmaps
  Retransmissions: packets, FEC information
  Timeout
Part II
  Introducing reliability
  ACK/NACK end-to-end solutions
  FEC-based solutions
  Layered solutions
  Router-assisted solutions
End-to-end reliability models
Sender-reliable
  The sender detects packet losses by gaps in the ACK sequence
  Easy resource management
Receiver-reliable
  Receivers detect packet losses and send NACKs towards the source
Challenge: scalability (1)
many problems arise with 10,000 receivers...
Problem 1: scalable control traffic
  ACK every 2 packets (à la TCP)... oops, 10,000 ACKs / 2 pkt!
  NAK (negative ack) only if failure... oops, if a pkt is lost close to the source, 10,000 NAKs!
  source implosion!
[Figure: NACK4 messages from many receivers converging on the source]
Challenge: scalability (2)
problem 2: scalable repairs/exposure
  receivers may receive the same packet several times
[Figure: a few NACK4 requests and many copies of the repair packet data4 spread over the tree]
A piece of the solutions (1)
solutions to problem 1: scalable control traffic
  solution 1: feedback suppression at the receivers
    each node picks a random backoff timer
    send the NAK at timeout if the loss is not corrected
  solution 2: proactive FEC (forward error correction)
    send data plus additional FEC packets
    any FEC packet can replace any lost data packet
  solution 3: use a tree of intelligent routers/servers
    use a tree for ACK aggregation and/or NAK suppression
    PGM, ARM, DyRAM
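The FEC idea can be demonstrated with its simplest instance, a single XOR parity packet over k data packets. This is a swapped-in teaching example, not the codes used by the protocols named here: one parity packet can replace any one lost data packet, whereas production schemes (for instance Reed-Solomon codes) generalise this to several losses. All function names are illustrative.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets: list) -> bytes:
    """Parity packet = XOR of all (equal-length) data packets."""
    return reduce(xor_bytes, packets)

def recover(received: dict, parity: bytes, k: int) -> dict:
    """Rebuild at most one missing packet among indices 0..k-1."""
    missing = [i for i in range(k) if i not in received]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    repaired = dict(received)
    if missing:
        repaired[missing[0]] = reduce(xor_bytes, received.values(), parity)
    return repaired

data = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(data)

# Different receivers lose different packets; the same parity repairs each of them.
print(recover({0: data[0], 2: data[2]}, parity, k=3)[1])
print(recover({1: data[1], 2: data[2]}, parity, k=3)[0])
```

This is why FEC scales for multicast: the sender multicasts one repair packet and receivers with different losses all recover, without sending a NAK per missing packet.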
A piece of the solutions (2)
solutions to problem 2: scalable repairs
  solution 1: use TTL-scoped retransmissions
    repair packets have limited scope
  solution 2: use proactive/reactive FEC
    proactive: always send data + FEC
    reactive: in case of retransmission, send FEC
  solution 3: use a tree of retransmission servers
    a receiver can be a retransmission server if it has the requested data
Scalable Reliable Multicast (Floyd et al., 1995)
  Receiver-reliable, NACK-based
  NACK local suppression
    Delay before sending
    Based on RTT estimation
    Deterministic + Stochastic
  Every member may multicast a NACK or a retransmission
  Periodic session messages
    Sequence number: detection of loss
    Estimation of the distance matrix among members
SRM Request Suppression
[Animation from Haobo Yu, Christos Papadopoulos: Src multicasts data; after the next packet arrives, each node picks a random backoff timer]
Deterministic Suppression
  Delay = C1 × d(S,R)
  Distance estimation between A and B from the timestamps T1, T2, T3 and T4: distance = (T4 - T3 + T2 - T1) / 2
[Figure from Haobo Yu, Christos Papadopoulos: sender, repairer and requestors placed at distances d, 2d, 3d, 4d; timeline showing data, session msg, nack and repair]
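Both formulas can be tried out directly. The sketch rests on two assumptions that the slide does not spell out: T1 and T4 are read on A's clock (A sends at T1, hears B's answer at T4) while T2 and T3 are read on B's clock (B receives at T2, answers at T3), so the clocks need not be synchronised; and the stochastic part of the timer is drawn uniformly from [0, C2 × d], as in the SRM paper. Function names and sample values are illustrative.

```python
import random

def estimate_distance(t1: float, t2: float, t3: float, t4: float) -> float:
    """One-way distance: half of the round trip minus B's holding time."""
    return (t4 - t3 + t2 - t1) / 2

def request_delay(distance: float, c1: float, c2: float, rng: random.Random) -> float:
    """Deterministic part C1*d plus a stochastic part drawn from [0, C2*d]."""
    return c1 * distance + rng.uniform(0, c2 * distance)

# A's clock: T1 = 100, T4 = 130. B's clock (offset by about 5000): T2 = 5010, T3 = 5020.
d = estimate_distance(t1=100, t2=5010, t3=5020, t4=130)
print(d)

rng = random.Random(0)
print(request_delay(d, c1=2.0, c2=2.0, rng=rng))
```

The distance comes out as 10.0 despite the clock offset, and the delay always falls between C1 × d and (C1 + C2) × d. Nodes closer to the loss time out first, and their multicast NACK suppresses the requests of nodes further away.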
Simple TTL-scoping of repairs
  use the TTL field of IP packets to limit the scope of the repair packet
[Figure: Src with scopes TTL=1, TTL=2, TTL=3]
Summary: reliability problems
What is the problem of loss recovery?
  feedback (ACK or NACK) implosion: ACK/NACK aggregation based on timers is approximate!
  replies/repairs duplications: TTL-scoped retransmissions are approximate!
  heterogeneity of receivers (crying baby, congestion control)
  difficult adaptability to dynamic membership changes
Design goals
  reduce the feedback traffic
  reduce recovery latencies
  improve recovery isolation
Part II
  Introducing reliability
  ACK/NACK end-to-end solutions
  FEC-based solutions
  Layered solutions
  Router-assisted solutions
SKIP: see the ICT'03 tutorial, http://www.ens-lyon.fr/~cpham
The reliable multicast universe
[Figure: protocols (RMX, NARADA, RMANP, ARM, DyRAM, AER, PGM, RMDP, ALC/LCT, LBRM, SRM, TRAM, RMTP, LMS, XTP, MTP, RMF, AFDP, …) grouped into families: application-based; router-assisted, active networking; layered/FEC; logging server/replier; end to end]
10 human years (means much more in computer years)
Part III
Status and Deployment of Multicast Technologies
Academics vs Users
  Academics: Multicast has been around for more than a decade, and we've proposed many protocols!
  Users: Yes, but very few real applications have been deployed on the Internet!
[Figure: multicast protocols: SRM, DVMRP, CBT, RMTP, LMS, MOSPF, MBGP, PIM-DM, MSDP, IGMP, RPM, HBH, LBRM, DyRAM…]
Connecting the two worlds is difficult!
[Figure: open issues: incremental deployment, groups management, session advertising, tree construction, address allocation, duplication engine, forwarding state, routing between multicast islands and unicast islands, TCP?, inter-domain routing, tunnelling, security, congestion control]
Inter-domain agreement
[Figure: the INTERNET as a set of domains with peering points, Internet routers and access routers; BGP and MBGP between domains]
Users' accesses
[Figure: residentials, offices, small offices and campus reach network providers, a metro ring, the core network (Gbps, DWDM) and an Internet Data Center; access technologies: PSTN 56 Kbps, ADSL 128/512 Kbps, shared cable 10 Mbps, ISDN 128 Kbps…; link labels: 100BaseTX, 2 Mbps FR, OC-3, OC-12]
Links heterogeneity
Backbone links
  optical fibers: 2.5 to 160 Gbps with DWDM techniques
End-user access
  9.6 Kbps (GSM) to 2 Mbps (UMTS)
  V.90 56 Kbps modem on twisted pair
  64 Kbps to 1930 Kbps ISDN access
  128 Kbps to 2 Mbps with xDSL modem
  1 Mbps to 10 Mbps cable modem
  155 Mbps to 2.5 Gbps SONET/SDH
Internet routers: key elements of internetworking
Routers
  run routing protocols and build routing tables,
  receive data packets and perform relaying,
  may have to consider Quality of Service constraints for scheduling packets,
  are highly optimized for packet forwarding functions.
Multicast in Points of Presence
[Figure (source: N. McKeown): endpoints A to F attached to points of presence POP1 to POP8]
Multicast, a threat for high-performance routers!
Please! Don't turn multicast ON!
The open model
CONTRACT
  Cannot control sources
  Cannot control receivers
  Cannot control groups
  Cannot control traffic
Please sign??
no security
BGP table size
[Figure; source: www.multicasttech.com/status]
MBGP table size
[Figure; source: www.multicasttech.com/status; annotation: BGP ~118000]
Relative Size of the Multicast Enabled Internet
[Figure; source: www.multicasttech.com/status]
The gap in images
[Figure: multicast ASes and unicast ASes in the INTERNET]
Autonomous Systems in the Multicast Enabled Internet: Totals and Those With Active Sources
[Figure; source: www.multicasttech.com/status; annotation: ~33%]
Last solution…
if you don't have access to IP Multicast you could try using: Overlays, End-system Multicast, Host-level Multicast, Application-level Multicast
[Figure (source: Yang-hua Chu): end hosts MIT1, MIT2, CMU1, CMU2, UCSD and Berkeley at the MIT, CMU, UCSD and Berkeley sites, connected by an overlay tree]
Conclusions (1)
Multicast: a technology with high potential…
  … but also awfully complex!
The technology is starting to mature:
  problems are well known and some protocols are already standardized (ALC family)
  ACK/NACK protocols are on the way to standardization (takes more time as the problems are tougher)
  this does not prevent the use of private reliable multicast solutions
Conclusions (2)
Deployment is mainly driven by academic networks… where are the killer applications?
  video and popular content distribution to clients… yes
  high performance computing over datagrids… yes
Where should we go?
  More specific models (e.g. SSM)
  More security, more control