Cross-Layer Techniques for Failure Restoration of IP Multicast with Applications to IPTV


IEEE/ACM COMSNETS, Bangalore, India, January 2010


M. Yuksel (1), K. K. Ramakrishnan (2), R. Doverspike (2), R. Sinha (2), G. Li (2), K. Oikonomou (2), and D. Wang (2)

[email protected], {kkrama,rdd,sinha,gli,ko,mei}@research.att.com

(1) University of Nevada – Reno; (2) AT&T Labs - Research

IPTV Today

- “Rich Media” applications like IPTV require significant capacity.
  - The capacity requirement keeps increasing as more and more TV channels are carried over the IP backbone and the metro area network.
  - Over 70% of raw link capacity is needed in a typical system.
- System typically organized as:
  - a small set of centralized content acquisition sites (head-ends);
  - a large number of media distribution sites in metropolitan cities;
  - a redundant set of routers and a number of servers at distribution sites;
  - a metro and neighborhood area network to reach the home.
- Uses IP multicast for distribution.
  - PIM-SSM (source-specific multicast) is the multicast protocol used.
  - Per-“channel” tree from the source (central acquisition) to the receivers.
  - Typically a group extends all the way to the consumer.

Backbone Failures

- IPTV and other multimedia performance requirements are very stringent.
  - E.g., the ITU requirement for packet loss probability for video distribution is less than 10^-8.
- Failures in a long-distance backbone are not rare.
  - Even multiple failures are not rare.
- When depending solely on Layer 3, recovery from a failure can take from tens of seconds up to several minutes. For example:
  - IGP can take tens of seconds to reconverge. Timers are set conservatively, in the interest of stability and scalability.
  - PIM typically refreshes (and thus reconverges) its tree on the order of minutes.
- Such recovery times are not tolerable.
  - Recovery times greater than 50-100 ms are difficult to treat using FEC and Resilient UDP.

Existing Failure Restoration Approaches

Link-level Fast Re-route (FRR): a pure Layer 2 approach
- Idea: Reroute traffic onto the backup path of a failing link.
- IGP and PIM are not informed about the failure.
- Pros: Higher layers are neither bothered by nor aware of the failure being restored; local decision; fast restoration (primarily failure detection time), ~50 ms.
- Cons: Traffic overlaps, and hence significant loss, are possible. Overlaps can last a long time (until the failure is repaired): several hours.

[Figure: example network with nodes 0-5 and link weights, showing the multicast source, the old tree before the failure, the new tree after the failure, and the FRR path for link 3-4.]
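The overlap problem can be made concrete with a small sketch. The tree and backup path below are hypothetical (the figure's exact topology is not reproduced here), and the loss estimate assumes a simple fluid model in which each copy of the stream needs 70% of the link, the typical load quoted earlier in the talk.

```python
def directed_links(path):
    """Turn a node sequence such as [3, 1, 0, 2, 4] into directed links."""
    return list(zip(path, path[1:]))


def overlapping_links(tree_links, frr_path):
    """Links that carry the stream twice: once on the tree, once in the FRR detour."""
    tree = set(tree_links)
    return [link for link in directed_links(frr_path) if link in tree]


def min_loss_fraction(load_per_copy, copies=2):
    """Fluid-model lower bound on loss when `copies` copies share one link.

    Assumption: link capacity is normalized to 1.0 and there is no prioritization.
    """
    offered = load_per_copy * copies
    return max(0.0, (offered - 1.0) / offered)


# Hypothetical multicast tree rooted at node 0 (parent -> child links).
tree = [(0, 1), (1, 3), (3, 4), (0, 2), (2, 5)]
# Hypothetical FRR backup path protecting the failed link 3-4.
frr_path_3_4 = [3, 1, 0, 2, 4]

shared = overlapping_links(tree, frr_path_3_4)
loss = min_loss_fraction(0.70)
print(shared, round(loss, 3))
```

In this made-up example the detour reuses link 0→2 in the same direction as the tree, so that link is offered 140% of its capacity for as long as the failure lasts.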



PIM Rejoin: a pure multicast-layer approach that depends on pure Layer 3 mechanisms
- A “passive” approach with standard PIM timers. Each PIM router resends a join on the upstream interface periodically, every 30 s or more, to refresh soft state.
- IGP is exposed to the failures.
- Pros: Standard definition of multicast. No need for extra implementation complexity.
- Cons: When FRR is not used, significant loss takes place. When FRR is used, traffic overlaps can occur. During switchover to the new tree, significant loss can occur.

We solve these issues without causing any significant state or messaging overhead.
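A rough timing model shows why the passive approach leads to long outages. This is only a sketch under stated assumptions: the tree is repaired by the first periodic join sent after the IGP has converged, propagation and processing delays are ignored, and the function names and the 10 s convergence value are illustrative.

```python
import math


def next_periodic_join(ready_at, period=30.0, first_join_at=0.0):
    """Time of the first periodic join sent at or after `ready_at`."""
    periods = max(0, math.ceil((ready_at - first_join_at) / period))
    return first_join_at + periods * period


def passive_outage(failure_at, igp_convergence, period=30.0, first_join_at=0.0):
    """Outage seen downstream when only the periodic PIM join repairs the tree."""
    ready_at = failure_at + igp_convergence  # new upstream is known from here on
    return next_periodic_join(ready_at, period, first_join_at) - failure_at


# Hypothetical numbers: failure at t = 2 s, IGP needs 10 s, joins every 30 s.
print(passive_outage(2.0, 10.0))   # join at t = 30 s
print(passive_outage(25.0, 10.0))  # the join at t = 30 s is too early; next at 60 s
```

Under these assumptions the outage is the IGP convergence time plus up to one full join period, which is in line with the outages of tens of seconds reported for the PIM-only case.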

FRR + IGP: Careful setting of IGP link weights
- Idea: Set IGP link weights such that overlaps are avoided.
- Again, IGP and PIM are not bothered with failures.
- Pros: It is feasible to find such link weights for single failures [INFOCOM’07].
- Cons: Overlaps are still possible for multiple failures.

[Figure: example network with nodes 0-5 and link weights, showing the multicast source and the FRR path for link 3-4.]

Our method can work over multiple failures and minimizes the likelihood of overlap.
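The effect of link weights on single-failure overlaps can be checked mechanically. The sketch below is not the weight-setting algorithm from the cited work; it only tests a given weight assignment. It assumes that the multicast tree is the shortest-path tree from the source, that the FRR backup for a link is the shortest path between its endpoints that avoids the link, and that an overlap is a directed link used by both. The 4-node ring and its weights are hypothetical.

```python
import heapq


def shortest_paths(weights, src, banned=None):
    """Dijkstra over an undirected graph {(u, v): weight}; the banned link is skipped."""
    adj = {}
    for (u, v), w in weights.items():
        if banned in ((u, v), (v, u)):
            continue
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, parent = {src: 0}, {src: None}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if v not in dist or d + w < dist[v]:
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return parent


def path_links(parent, dst):
    """Directed links from the search root to dst, read off the parent map."""
    links = []
    while parent[dst] is not None:
        links.append((parent[dst], dst))
        dst = parent[dst]
    return links[::-1]


def single_failure_overlaps(weights, source):
    """For each tree link, list the surviving tree links reused by its FRR backup."""
    parent = shortest_paths(weights, source)
    tree = {(p, c) for c, p in parent.items() if p is not None}
    result = {}
    for u, v in sorted(tree):
        backup = path_links(shortest_paths(weights, u, banned=(u, v)), v)
        surviving = tree - {(u, v)}
        result[(u, v)] = [link for link in backup if link in surviving]
    return result


# Hypothetical ring 0-1-2-3-0 with the source at node 0.
naive = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 2}
tuned = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 5}
print(single_failure_overlaps(naive, 0))
print(single_failure_overlaps(tuned, 0))
```

With the heavier weight on link 3-0 the tree becomes the chain 0→1→2→3, every backup path runs against the tree direction, and no single failure produces an overlap. The check says nothing about double failures, which is the limitation noted on the slide.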



Multiple Failures

- None of the existing approaches can reasonably handle multiple failures.
  - Multiple failures can cause FRR traffic to overlap.
- PIM must be informed about the failures and should switch over to the new tree as soon as possible, so that overlaps due to multiple failures are minimized.

No single failure causes an overlap, but a double failure does.

[Figure: example network with nodes 0-5 and link weights, showing the multicast source, the old tree before the failures, and the FRR paths for links 1-3 and 1-2.]
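A double-failure overlap can be illustrated by counting how many copies of the stream each directed link carries. The tree and the two backup paths below are hypothetical stand-ins for the figure, chosen so that the two FRR detours share their first hop.

```python
from collections import Counter


def copies_per_link(tree_links, failed_links, frr_paths):
    """Count copies of the stream on every directed link after the given failures."""
    copies = Counter(link for link in tree_links if link not in failed_links)
    for link in failed_links:
        path = frr_paths[link]  # node sequence of the backup path for this link
        copies.update(zip(path, path[1:]))
    return copies


def overlapped(copies):
    """Links that carry more than one copy."""
    return sorted(link for link, n in copies.items() if n > 1)


# Hypothetical tree rooted at node 0 and hypothetical FRR paths.
tree = [(0, 1), (1, 2), (1, 3)]
frr_paths = {(1, 2): [1, 4, 2], (1, 3): [1, 4, 3]}

print(overlapped(copies_per_link(tree, [(1, 2)], frr_paths)))
print(overlapped(copies_per_link(tree, [(1, 3)], frr_paths)))
print(overlapped(copies_per_link(tree, [(1, 2), (1, 3)], frr_paths)))
```

Each failure on its own leaves every link with a single copy; only when both links are down do the two detours stack up on link 1→4.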


Our Approach: FRR + IGP + PIM

Key contributions of our approach:
- It guarantees reception of all data packets even after a failure (except the packets in transit): hitless.
- It can be initiated when a failure is detected locally by the router and does not have to wait until routing has converged network-wide: works with local rules.
- It works even if the new upstream router is one of the current downstream routers: prevents loops during switchover.

[Figure: layered architecture. Layer 2: FRR support (e.g., MPLS). Layer 3: IGP routing (e.g., OSPF) and the multicast protocol (e.g., PIM-SSM). The connections between the boxes are labeled "Link failure/recovery" and "Routing changes".]


IGP-aware PIM: Key Ideas

Our key ideas as “local” rules for routers:
- Rule #1: Expose the link failure to IGP routing even though the FRR backup path is in use.
- Rule #2: Notify the multicast protocol that IGP routing has changed so that it can reconfigure whenever possible.
  - PIM will evaluate whether any of its (S,G) upstream nodes has changed. If so, it will try sending a join to the new upstream node. Two possibilities:
    - #2.a: The new upstream node is NOT among the current downstream nodes. Just send the join immediately.
    - #2.b: The new upstream node is among the current downstream nodes. Move this (S,G) into the “pending join” state by marking a binary flag.
  - Do not remove the old upstream node’s state info yet.
- Rule #3: Prune the old upstream only after data arrives on the new tree.
  - Send a prune to the old upstream node when you receive a data packet from the new upstream node.
  - Remove the old upstream node’s state info.
- Rule #4: Exit from the transient “pending join” state upon prune reception.
  - When a prune arrives from a (currently downstream) node on which there is a “pending join”:
    - Execute the prune normally.
    - Send the joins for all (S,G)s that have been “waiting-to-send-join” on the sender of the prune.

Very minimal additional multicast state.
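Rules #2 to #4 can be read as a small per-(S,G) state machine. The following is a minimal sketch of that reading, not the authors' implementation: the class and method names are invented for illustration, messages are only logged rather than sent, and Rule #1 (exposing the failure to the IGP) is assumed to happen outside this class and to trigger `on_routing_change`.

```python
class SGState:
    """Per-(S,G) state at one router, following local rules #2 to #4."""

    def __init__(self, upstream, downstream=()):
        self.upstream = upstream        # current (old) upstream neighbor
        self.new_upstream = None        # upstream chosen after a routing change
        self.downstream = set(downstream)
        self.pending_join = False       # the one extra binary flag
        self.sent = []                  # log of (message, neighbor), illustration only

    def on_routing_change(self, new_upstream):
        """Rule #2: the IGP reports a new upstream for this (S,G)."""
        if new_upstream == self.upstream:
            return
        self.new_upstream = new_upstream  # old upstream state is kept for now
        if new_upstream in self.downstream:
            self.pending_join = True      # 2.b: defer the join
        else:
            self.sent.append(("join", new_upstream))  # 2.a: join immediately

    def on_data(self, from_node):
        """Rule #3: prune the old upstream once data arrives on the new tree."""
        if self.new_upstream is not None and from_node == self.new_upstream:
            self.sent.append(("prune", self.upstream))
            self.upstream, self.new_upstream = self.new_upstream, None

    def on_prune(self, from_node):
        """Rule #4: execute the prune, then release a join deferred on that neighbor."""
        self.downstream.discard(from_node)
        if self.pending_join and from_node == self.new_upstream:
            self.pending_join = False
            self.sent.append(("join", from_node))

    def on_join(self, from_node):
        """Standard behavior: add the joining neighbor as a downstream interface."""
        self.downstream.add(from_node)


# A router whose new upstream (2) is currently one of its downstream nodes.
# The old upstream (3) is a hypothetical choice.
router = SGState(upstream=3, downstream=[2])
router.on_routing_change(2)   # 2.b: no join yet
router.on_prune(2)            # rule 4: the deferred join is released
router.on_data(2)             # rule 3: old upstream is pruned
print(router.sent)
```

Because the object models a single (S,G), releasing "all waiting joins" in Rule #4 reduces to one join here; a router would apply the same step to every (S,G) flagged on the neighbor that sent the prune.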



IGP-aware PIM Switchover: A sample scenario, No FRR yet

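[Figure: example network with nodes 0-5 and link weights, showing the multicast source, the old tree before the failure, and the join and prune messages exchanged during the switchover.]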

Node 4:
- detects the routing change after SPF and tries to send a join message to 2 (#2)
- moves to “pending join” state (#2.b)

Node 2:
- hears about the failure via IGP announcements and does SPF
- detects the routing change after SPF and tries to send a join message to 1 (#2)
- sends the join to 1 (#2.a) but does not install the 2→1 interface yet

Node 1:
- receives the join message from 2
- adds the 1→2 downstream interface, and data starts flowing onto the new tree

Node 2:
- receives data packets from the new tree and sends a prune to the old upstream node (#3)

Node 4:
- receives the prune from 2 and moves out of “pending join” state by sending the join to 2 (#4)
- processes the received prune

Node 2:
- receives the join message from 4
- adds the 2→4 downstream interface, and data starts flowing onto the new tree
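The ordering in this walk-through matters because node 4's new upstream (node 2) is still its downstream neighbor. The sketch below replays the forwarding-state changes and checks for a forwarding loop after every step. It uses only the links named in the walk-through (4→2 on the old tree, 1→2 and 2→4 on the new one); the `replay` helper and the "naive" ordering are illustrative assumptions.

```python
def has_cycle(links):
    """Return True if the directed forwarding links contain a cycle."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, []).append(v)
    state = {}  # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in adj.get(u, []):
            if state.get(v) == "visiting":
                return True
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(visit(u) for u in list(adj) if u not in state)


def replay(initial_links, steps):
    """Apply ("add" | "remove", link) steps; report whether a loop ever appears."""
    links = set(initial_links)
    loop_seen = has_cycle(links)
    for op, link in steps:
        if op == "add":
            links.add(link)
        else:
            links.discard(link)
        loop_seen = loop_seen or has_cycle(links)
    return links, loop_seen


old_tree = {(4, 2)}  # node 4 currently forwards to node 2

# Order produced by the rules: the join from 4 waits for the prune from 2.
igp_aware = [("add", (1, 2)), ("remove", (4, 2)), ("add", (2, 4))]
# Hypothetical naive order: node 4's join is accepted before node 2 has pruned.
naive = [("add", (1, 2)), ("add", (2, 4)), ("remove", (4, 2))]

print(replay(old_tree, igp_aware))
print(replay(old_tree, naive))
```

Both orders end with the same links, but only the naive one passes through a state in which 2→4 and 4→2 are installed at once, which is the transient loop that the “pending join” flag avoids.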


FRR Support: Congested Common Link

- When there is FRR support, common links (i.e., overlaps) may occur.
- Common Link (CL): During a switchover, the new tree might overlap with the FRR path of the link that failed.
- Issue: Congested Common Link
  - The CL might experience congestion, and data packets on the new tree (blue) might never arrive at node 4.
- Solution: Allow CLs, but prioritize the traffic on the new tree.
  - After a link failure, mark the data traffic on the new tree with a higher priority and FRR packets with a lower priority.

[Figure: example network with nodes 0-5 and link weights, showing the multicast source and the common link (CL).]
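The prioritization on a common link can be sketched as strict-priority service over one interval. Strict priority is an assumption here (the slide only says higher and lower priority), and the packet counts are made-up numbers that mirror two streams at 70% load on a link that can carry 10 packets per interval.

```python
def serve_common_link(new_tree_pkts, frr_pkts, capacity):
    """Serve one interval: new-tree packets first, FRR packets in what is left."""
    sent_new = new_tree_pkts[:capacity]
    spare = capacity - len(sent_new)
    sent_frr = frr_pkts[:spare]
    dropped = (len(new_tree_pkts) - len(sent_new)) + (len(frr_pkts) - len(sent_frr))
    return sent_new, sent_frr, dropped


# Hypothetical interval: 7 packets on the new tree, 7 in the FRR tunnel, room for 10.
new_tree = [("new", seq) for seq in range(7)]
frr = [("frr", seq) for seq in range(7)]

sent_new, sent_frr, dropped = serve_common_link(new_tree, frr, capacity=10)
print(len(sent_new), len(sent_frr), dropped)
```

All new-tree packets get through, so the downstream router sees data from its new upstream and can complete the switchover under Rule #3; the loss is confined to the FRR copies.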


Experimental Setup

- ns-2 simulation of OSPF as the IGP, PIM-SSM as the multicast protocol, and MPLS for FRR support
- Comparative evaluation of:
  - PIM-SSM Only: the standard IP multicast with PIM rejoin
  - PIM-SSM w/ FRR: only FRR is used for restoration
  - IGP-aware PIM-SSM w/ FRR: our multicast tree switchover protocol
  - IGP-aware PIM-SSM w/ FRR – Priority: our multicast tree switchover protocol with low-priority forwarding of FRR traffic


Experimental Setup (cont’d)

- Link weight setting: Equal Link Weights vs. Intelligent Link Weights
- 120 ms buffer time, 5 s spfDelayTime, and 10 s spfHoldTime
- 30 s PIM rejoin time
- UDP multicast traffic with 70% link load
- Observed hit time and lost packets during reconvergence following failure(s)
  - Single failures: Each link on the tree is failed in turn, and reconvergence is observed for 30 s.
  - Double failures: The first failure occurs at the 10th second and the second at the 50th second. The first failed link recovers at the 200th second and the second recovers at the 250th second. The system is observed from second 10 through second 250.
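The double-failure schedule can be turned into the time windows during which one or two links are down. This is only a bookkeeping sketch; the event encoding (+1 for a failure, -1 for a recovery) and the function name are illustrative.

```python
def concurrent_failure_windows(events):
    """events: (time_s, +1 for failure / -1 for recovery).

    Returns a list of (start_s, end_s, links_down) windows with at least one link down.
    """
    windows, down, prev = [], 0, None
    for t, delta in sorted(events):
        if prev is not None and t > prev and down > 0:
            windows.append((prev, t, down))
        down += delta
        prev = t
    return windows


# Schedule from the setup: failures at 10 s and 50 s, recoveries at 200 s and 250 s.
schedule = [(10, +1), (50, +1), (200, -1), (250, -1)]
print(concurrent_failure_windows(schedule))
```

The two failures overlap for 150 s (from second 50 to second 200), which is the period in which double-failure overlaps can arise.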

Topology A: Hypothetical US backbone: 28 nodes, 45 links
- Intelligent Link Weights
- Node 13 is the multicast source



Topology B: Exodus (from Rocketfuel): 21 nodes, 36 links
- Intelligent Link Weights
- Node 16 is the multicast source


Simulation Results: Single Failures

- “PIM-SSM only” experiences outages of tens of seconds: unacceptable.
- IGP-aware PIM with FRR has about the same “hit” time as single-failure recovery with FRR alone.
  - Failure detection time dominates.
- Equal Link Weights do not change the results significantly.

[Figure: maximum hit time (seconds, log scale from 0.01 to 100) per receiver, for Topology A and Topology B with Intelligent Link Weights. Series: PIM-SSM w/ FRR; IGP-aware PIM-SSM w/ FRR; IGP-aware PIM-SSM w/ FRR (Priority); PIM-SSM only.]


Simulation Results: Single Failures

- With a pure Layer 3 (PIM-SSM only) solution, far too many packets are lost.
- Layer 2 recovery mechanisms like FRR help significantly.
- Our IGP-aware mechanism introduces no further hits and can handle unwisely set link weights.

[Figure: maximum lost packets (%, log scale from 0.01 to 100) per receiver; panels labeled Topology A and Equal Link Weights.]


Simulation Results: Double Failures

- With a pure Layer 2 (PIM-SSM w/ FRR) solution, a hit of 10 seconds takes place.
- Our IGP-aware mechanism reduces the average hit time to below 100 ms.
- Results are similar for Topology B or Equal Link Weights.

[Figure: maximum hit time and average hit time (seconds, log scale) per receiver, for Topology A with Intelligent Link Weights.]


- With a pure Layer 2 (PIM-SSM w/ FRR) solution, packet loss can be 100% during recovery from failures.
- Our IGP-aware mechanism significantly reduces the packet loss.
- Results are similar for Topology B.

[Figure: maximum lost packets (%, from 0 to 100) per receiver; panels labeled Topology A and Equal Link Weights.]



- With a pure Layer 2 (PIM-SSM w/ FRR) solution, packet loss is about 10% during recovery from failures.
- Our IGP-aware mechanism significantly reduces the packet loss.
- Results are similar for Topology B.

[Figure: lost packets per receiver (log scale from 0.01 to 100); panel labeled Topology B, Equal Link Weights.]


Summary

- A method to make PIM-SSM re-convergence aware of the underlying network failure conditions.
- The method allows Fast Reroute support at the link layer.
- The method is much better than previous approaches in handling:
  - unwisely set IGP link weights
  - multiple failures
- Proofs are in the paper.


Thank you!
