The Impact of Policy and Topology on Internet Routing Convergence
NANOG 20, October 23, 2000
Abha Ahuja, InterNap, [email protected]
Craig Labovitz, Microsoft Research, [email protected]
*In collaboration with Roger Wattenhofer, Srinivasan Venkatachary, Madan Musuvathi
At NANOG 19, we showed that BGP exhibits poor convergence behavior:
1) Measured convergence times of up to 20 minutes for BGP path changes/failures
2) Factorial (N!) theoretical upper bound on BGP convergence complexity (explore all paths of all possible lengths)
Open question: In practice, what topological and policy factors impact convergence delay?
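To give a feel for why "all paths of all possible lengths" grows factorially, here is a minimal sketch that counts loop-free paths between two ASes. It assumes a hypothetical full mesh of n ASes, which is a worst case chosen for illustration and not a topology from the talk; the function name is also illustrative. It counts candidate paths only and does not reproduce the NANOG 19 bound itself.

```python
from math import perm  # Python 3.8+


def loop_free_paths_full_mesh(n: int) -> int:
    """Count loop-free paths between two fixed ASes in a full mesh of n ASes.

    A path may pass through k of the other n - 2 ASes, in any order,
    so there are perm(n - 2, k) paths with exactly k transit ASes.
    """
    if n < 2:
        raise ValueError("need at least a source and a destination AS")
    transit = n - 2
    return sum(perm(transit, k) for k in range(transit + 1))


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, loop_free_paths_full_mesh(n))
```

The largest term in the sum is (n - 2)!, which is where the factorial growth comes from.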
This Talk
Goal: Understand BGP convergence behavior under real topologies/policies
– Given a physical topology and ISP policies, can we estimate the time required for convergence?
– Do convergence behaviors of ISPs differ?
– How does steady-state topology compare to paths explored during failure?
– Can we change policies/topology to improve BGP convergence times?
Experiments
• Analyzed secondary paths between 20 source/destination AS pairs
– Inject and monitor BGP faults
– Survey providers to determine policies behind paths
• To provide intuition, we will focus on faults injected into three ISPs at Mae-West
– Observed faults via fourth ISP (in Japan)
– Three ISPs roughly map onto tier1, tier2, tier3 providers
– Results from these three ISPs are representative of all data
Comparing ISP Convergence Latencies
• CDF of faults injected into three Mae-West providers and observed at Japanese ISP
• Significant variations between providers
• Not related to geography
Observed Fault Injection Topologies
• In steady state, the topologies for ISP1, ISP2, and ISP3 are similar: all are direct BGP peers of ISP4. This does not explain the variation on the previous slide…
[Figure: steady-state topology at MAE-WEST. ISP 1 (R1), ISP 2 (R2), and ISP 3 (R3) each connect to ISP 4; a FAULT is marked at each of R1, R2, and R3.]
Factors Impacting BGP Propagation
• Topology and policy impact graph (usually DAG)
• Each AS router adds between 0 and 45 seconds of MinRouteAdver delay
• iBGP/Route Reflector
• MinRouteAdver and path race conditions affect which routes are chosen as backup routes
– Customer/provider
• customer sends their customer routes
• provider sends default-free routing info (or default)
– Peer relationships
• Bilateral exchange of customer routes
– Back-up transit
• peer relationship becomes transit relationship based on failure
• These relationships constrain topology (no N! states) and determine the number of possible backup paths
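The effect of these relationships on the number of backup paths can be sketched in a few lines. The export rule used here is an assumption that condenses the bullets above: an AS passes a route along only if it learned the route from a customer or is sending it to a customer. The five-AS topology and all function names are hypothetical, chosen only to show the filtering; back-up transit is not modeled.

```python
def provider_customer(rel, provider, customer):
    """Record a provider/customer link; rel[(a, b)] is b's role as seen from a."""
    rel[(provider, customer)] = "customer"
    rel[(customer, provider)] = "provider"


def peering(rel, a, b):
    """Record a bilateral peer link."""
    rel[(a, b)] = rel[(b, a)] = "peer"


def simple_paths(rel, src, dst, path=None):
    """Yield every loop-free AS path from src to dst, ignoring policy."""
    path = [src] if path is None else path
    if src == dst:
        yield list(path)
        return
    for a, b in sorted(rel):
        if a == src and b not in path:
            yield from simple_paths(rel, b, dst, path + [b])


def permitted(rel, path):
    """Check the assumed export rule at every transit AS on the path.

    Routes flow from the destination (end of path) toward the source, so
    path[i] learns the route from path[i + 1] and exports it to path[i - 1].
    """
    for i in range(1, len(path) - 1):
        learned_from_customer = rel[(path[i], path[i + 1])] == "customer"
        exported_to_customer = rel[(path[i], path[i - 1])] == "customer"
        if not (learned_from_customer or exported_to_customer):
            return False
    return True


# Hypothetical topology: providers 1 and 2 peer, their customers 3 and 4
# peer, and destination D buys transit from both 3 and 4.
rel = {}
provider_customer(rel, "1", "3")
provider_customer(rel, "2", "4")
peering(rel, "1", "2")
peering(rel, "3", "4")
provider_customer(rel, "3", "D")
provider_customer(rel, "4", "D")

all_paths = list(simple_paths(rel, "1", "D"))
policy_paths = [p for p in all_paths if permitted(rel, p)]

if __name__ == "__main__":
    print("all loop-free paths:", all_paths)
    print("policy-permitted paths:", policy_paths)
```

In this toy graph the policy check discards the paths that would have AS 3 or AS 4 carry a peer's route up to its own provider.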
Convergence in the Real World
[Figure: five-node AS graph (nodes 1 to 5) with links labeled customer or peer; X marks the fault.]
Longest path: 3 4 5 2 1
Possible paths for node 3:
2 1 x
4 2 1 x
(4 5 2 1 x)
Possible paths for node 4:
2 1 x
3 2 1 x
5 2 1 x
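The path lists on this slide can be reproduced by brute-force enumeration. The edge list below is an assumption inferred from the listed paths (the original figure did not survive in this transcript), and the helper names are illustrative. Policy is ignored here, so every loop-free path counts.

```python
# Edges inferred from the paths listed on the slide (an assumption).
EDGES = [(1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> sorted-neighbors map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def paths_to(adj, src, dst, path=None):
    """Yield every loop-free path from src to dst (depth-first search)."""
    path = [src] if path is None else path
    if src == dst:
        yield list(path)
        return
    for nxt in adj[src]:
        if nxt not in path:
            yield from paths_to(adj, nxt, dst, path + [nxt])


ADJ = build_adjacency(EDGES)

if __name__ == "__main__":
    for node in (3, 4):
        found = sorted(paths_to(ADJ, node, 1), key=len)
        print(f"node {node}:", found, "longest:", found[-1])
```

For node 3 the longest path found is 3 4 5 2 1, matching the slide.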
Convergence in the Real World
[Figure: the same five-node AS graph with customer and peer links, X marking the fault, and the label "Tier 1?".]
Longest path: 3 4 5 2 1
Possible paths for node 3:
2 1 x
4 5 2 1 x
Possible paths for node 4:
3 2 1 x
5 2 1 x
Hierarchy eliminates some states
Policy and Convergence
• Strict hierarchical relationships eliminate exploring some extra states
– Policy controls the number of possible paths to explore.
– But it turns out the number of paths does not matter…
Relationship Between Backup Paths and Convergence
• Convergence is related to the length of the longest possible backup ASPath between two nodes
[Figure: Longest Observed ASPath Between AS Pair]
So, what does all of this mean for convergence time?
• Convergence time is related to the length of the longest path that needs to be explored
– Before fail-over, need to withdraw all alternative paths
– This is bounded, O(n), by the length of the longest alternative path in the system
– This longest path is related to policy
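As a back-of-the-envelope illustration of the linear relationship above, the sketch below multiplies the hop count of the longest alternative path by the 0 to 45 second per-AS MinRouteAdver delay quoted earlier in the talk. This crude linear model is an assumption made for illustration, not a result from the talk, and the function name and parameters are hypothetical.

```python
def convergence_delay_range_s(longest_alt_path, per_hop_min_s=0, per_hop_max_s=45):
    """Return a rough (low, high) delay range in seconds.

    Assumes each AS-to-AS hop on the longest alternative path adds one
    MinRouteAdver delay drawn from [per_hop_min_s, per_hop_max_s].
    """
    if len(longest_alt_path) < 2:
        raise ValueError("a path needs at least two ASes")
    hops = len(longest_alt_path) - 1
    return hops * per_hop_min_s, hops * per_hop_max_s


if __name__ == "__main__":
    # Longest path from the earlier five-node example: 3 4 5 2 1
    low, high = convergence_delay_range_s([3, 4, 5, 2, 1])
    print(f"rough delay range: {low} to {high} seconds")
```

Under this model, shortening the longest alternative path by one hop lowers the upper estimate by one MinRouteAdver interval, whatever the total number of paths.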
Towards Millisecond BGP Convergence
Three possible solutions
1) Entirely new protocol
2) Turn off MinRouteAdver timer
3) “Tag” BGP updates
– Provide hint so nodes can detect bogus state information
Further Information
C. Labovitz, R. Wattenhofer, A. Ahuja, S. Venkatachary, “The Impact of Topology and Policy on Delayed Internet Routing Convergence”. MSR Technical Report (number pending). June, 2000.
C. Labovitz, A. Ahuja, A. Bose, F. Jahanian, “Internet Delayed Routing Convergence.” To appear in Proceedings of ACM SIGCOMM. August, 2000.
Send email to [email protected] for more information or to participate in the policy survey