A Case Study in Understanding OSPFv2 and BGP4 Interactions Using Efficient Experiment Design
David Bauer†, Murat Yuksel‡, Christopher Carothers† and Shivkumar Kalyanaraman‡
†Department of Computer Science
‡Department of Electrical, Computer and Systems Engineering
Rensselaer Polytechnic Institute
ROSS.Net built and utilized to address both parts of the problem
Goal: “good results fast” leading to an understanding of the system under test (make sense of the results)
[Figure: Response Surface, with labels BGP4 and OSPFv2]
Understand protocol interactions through UPDATE messages generated by and between protocols
Why Are Feature Interactions Harmful?
Network protocol weaknesses are not fully understood until implemented / simulated at large scale
Are decisions made to efficiently route data within a domain adversely affecting our ability to efficiently route data across the domain?
Hot-potato routing: a small degree of unstable information affects a large portion of traffic
Cold-potato routing
[Figure: AS 0, AS 1, AS 2]
Local Policy: optimize routing within AS (OSPFv2)
Local Policy: optimize routing between ASes (BGP4)
Global Policy: optimize routing within and between ASes
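The hot-potato effect can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical ingress router that knows its IGP (OSPF) cost to two egress routers and that every prefix is reachable through both. The names `pick_egress`, `shifted_prefixes`, the egress labels, and the costs are all made up for the example and are not taken from ROSS.Net.

```python
def pick_egress(igp_cost):
    """Hot-potato rule: leave the AS at the egress with the lowest IGP cost.

    Ties are broken by egress name so the choice is deterministic.
    """
    return min(igp_cost, key=lambda egress: (igp_cost[egress], egress))


def shifted_prefixes(prefixes, cost_before, cost_after):
    """Return the prefixes whose egress changes after an IGP cost change."""
    before = pick_egress(cost_before)
    after = pick_egress(cost_after)
    return list(prefixes) if before != after else []


# Hypothetical costs from one ingress router to two egress routers.
cost_before = {"egress_a": 8, "egress_b": 9}
cost_after = {"egress_a": 10, "egress_b": 9}  # one path cost rises from 8 to 10

# 1000 illustrative /24 prefixes, all learned through both egress routers.
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]

moved = shifted_prefixes(prefixes, cost_before, cost_after)
print(len(moved))  # 1000: every prefix moves to egress_b
```

A single intra-domain cost change moves every prefix at once, which is the "small degree of unstable information affects a large portion of traffic" behaviour the slide describes.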
Large-scale Simulation
Topology from Rocketfuel data
Network Hierarchy:
– Level 0 routers: 9.92 Gb/sec and 1 ms delay
– Level 1 routers: 2.48 Gb/sec and 2 ms delay
– Level 2 routers: 620 Mb/sec and 3 ms delay
– Level 3 routers: 155 Mb/sec and 50 ms delay
– Level 4 routers: 45 Mb/sec and 50 ms delay
– Level 5 routers and below: 1.55 Mb/sec and 50 ms delay
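The hierarchy above maps naturally onto a lookup table. The sketch below copies the bandwidth and delay values from the slide. It assumes that "Level 5 and below" means levels numbered 5 and higher, and that the values describe the links attached to routers of that level. The function names are hypothetical.

```python
# Link classes from the slide: level -> (bandwidth in bits/sec, delay in seconds).
LINK_CLASSES = {
    0: (9.92e9, 0.001),
    1: (2.48e9, 0.002),
    2: (620e6, 0.003),
    3: (155e6, 0.050),
    4: (45e6, 0.050),
    5: (1.55e6, 0.050),
}


def link_class(level):
    """Assumption: levels numbered 5 and higher all share the slowest class."""
    return LINK_CLASSES[min(level, 5)]


def one_hop_latency(level, packet_bytes):
    """Transmission time plus propagation delay for one packet on one link."""
    bandwidth, delay = link_class(level)
    return packet_bytes * 8 / bandwidth + delay


# A 1240-byte packet on a level 0 link: 9920 bits / 9.92e9 b/s plus 1 ms.
print(one_hop_latency(0, 1240))
```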
LEVEL 3 (AS 3356): iBGP: 7,921; eBGP: 210; OSPFv2 routers: 2,064; links: 8,669
Tiscali (AS 3257): iBGP: 441; eBGP: ; OSPFv2 routers: 618; links: 839
EBONE (AS 1755): iBGP: 16,384; OSPFv2 routers: 438; links: 1,192
EXODUS (AS 3967): iBGP: 50,176; eBGP: 53; OSPFv2 routers: 688; links: 2,166
ABOVENET (AS 6461): iBGP: 2,500; eBGP: 199; OSPFv2 routers: 843; links: 2,667
[Figure: topology of the five ASes; numeric labels in the figure: 8, 18, 12, 161, 6, 9, 26, 12, 11, 12]
Experiment Design and Analysis
Three classes of protocol parameters:
– OSPF timers, BGP timers, BGP decision
RRS was allowed 200 trials to optimize (minimize) the response surface
– Heuristic search algorithm
Applied multiple linear regression analysis on the results
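The search-then-regress workflow can be sketched as follows. This is not RRS: it substitutes plain uniform random sampling with the same 200-trial budget, and the `response` function is a made-up linear stand-in for a simulation run. The three parameters, their ranges, and all function names are assumptions for illustration only.

```python
import random


def response(params):
    """Hypothetical stand-in for one simulation run; returns an update count."""
    # Assumption: a made-up linear response so the sketch runs without a simulator.
    return 100.0 + 3.0 * params[0] - 0.5 * params[1] + 0.0 * params[2]


def random_search(bounds, trials, rng):
    """Plain uniform random search (simpler than RRS) with a fixed trial budget."""
    samples = []
    for _ in range(trials):
        point = [rng.uniform(lo, hi) for lo, hi in bounds]
        samples.append((point, response(point)))
    best = min(samples, key=lambda s: s[1])
    return best, samples


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(samples):
    """Least-squares fit y ~ b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + point for point, _ in samples]
    ys = [y for _, y in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(0)
bounds = [(1.0, 60.0)] * 3  # hypothetical timer ranges in seconds
best, samples = random_search(bounds, trials=200, rng=rng)
coeffs = fit_linear(samples)  # intercept followed by one coefficient per parameter
print(best[1], [round(c, 3) for c in coeffs])
```

A coefficient near zero (the third parameter here) marks a parameter with little influence on the response, which is the kind of signal the regression step is used for.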
Response Plane
Intra-domain routing decisions can affect inter-domain behavior, and vice versa.
All updates belong to one of four categories:
– OSPF-caused OSPF (OO) update
– OSPF-caused BGP (OB) update – interaction
– BGP-caused OSPF (BO) update – interaction
– BGP-caused BGP (BB) update
[Figure: OB Update example. Link failure or cost increase (e.g. maintenance); labels: Destination, 8, 10]
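The four categories follow directly from two facts about each update: which protocol caused it and which protocol sent it. A minimal sketch, assuming a hypothetical event log of (cause, update) pairs; the function names and the sample events are illustrative only.

```python
from collections import Counter


def classify(cause, update):
    """Map (protocol that caused, protocol that sent the update) to OO/OB/BO/BB."""
    codes = {"OSPF": "O", "BGP": "B"}
    return codes[cause] + codes[update]


def is_interaction(category):
    """OB and BO updates cross the protocol boundary."""
    return category in ("OB", "BO")


# Hypothetical event log: (cause protocol, protocol of the resulting update).
events = [
    ("OSPF", "OSPF"),
    ("OSPF", "BGP"),
    ("BGP", "BGP"),
    ("BGP", "OSPF"),
    ("OSPF", "BGP"),
]

counts = Counter(classify(cause, update) for cause, update in events)
interactions = sum(n for category, n in counts.items() if is_interaction(category))
print(dict(counts), interactions)
```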
Response Plane
[Figure: BO Update example. eBGP connectivity becomes available; label: Destination]
These interactions cause route changes to thousands of IP prefixes, i.e. huge traffic shifts!!
High Level Characterization
Optimized with respect to OB+BO response surface.
BGP timers play the major role, i.e. ~15% improvement in the optimal response.
– BGP KeepAlive timer seems to be the dominant parameter, in contrast to expectation of MRAI!
OSPF timers have little effect, i.e. at most 5%.
– Low time-scale OSPF updates do not affect BGP.
~15% improvement when BGP timers included in search space
Varied response surfaces, each equivalent to a particular management approach.
Importance of parameters differs for each metric.
For minimal total updates:
– Local perspectives are 20-25% worse than the global.
For minimal total interactions:
– Optimizing other metrics can be 15-25% worse.
OB updates are more important than BO updates (i.e. ~50% vs. ~0.1% of total updates)
Important to optimize OSPF
OB: ~50% of total updates
BO: ~0.1% of total updates
Global perspective 20-25% better than local perspectives
Minimize total BO+OB 15-25% better than other metrics
Q: Can we use this approach to provide guidance for network routing policies?
Performed full factorial of RRS searches, turning Hot-, Cold-potato routing ON/OFF
Provide quantitative results from which qualitative statements can be made
Verified AT&T and Sprint measurements
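A full factorial over the two routing switches is small enough to enumerate directly. The sketch below lists the four ON/OFF combinations and, for contrast, counts the runs a full factorial would need over a larger parameter grid. The grid of 10 parameters with 10 values each is a hypothetical example, not the parameter space used in the study.

```python
from itertools import product
from math import prod

# The two routing switches from the slide, each ON (True) or OFF (False).
factors = {"hot_potato": (True, False), "cold_potato": (True, False)}
configs = [dict(zip(factors, values)) for values in product(*factors.values())]
# 2 x 2 = 4 configurations; one search would be run per configuration.


def full_factorial_size(levels_per_parameter):
    """Number of runs needed to try every combination of parameter levels."""
    return prod(levels_per_parameter)


# Hypothetical grid: 10 parameters with 10 candidate values each.
grid = full_factorial_size([10] * 10)

print(len(configs), grid)  # 4 10000000000
```

Against a grid of that size, a budget of 200 trials per search is smaller by many orders of magnitude.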
Design 2: Hot- vs. Cold-Potato Routing
No major impact regardless of search performed
Majority of UPDATEs were generated by LOCAL-Pref and AS Path length
MED was << 1% of UPDATEs
Hot Potato was 0.8%
Larger question: Which steps in the BGP decision making algorithm are most important?
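One way to approach that question is to record which step of the decision process actually breaks the tie for each route selection. The sketch below is a simplified decision process that covers only the four steps named on the slide (LOCAL_PREF, AS path length, MED, hot potato). It omits other steps of the real BGP4 process, such as origin type and eBGP over iBGP, and compares MED across all routes, which real BGP does only under specific conditions. The `Route` fields and step names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str
    local_pref: int
    as_path_len: int
    med: int
    igp_cost: int


# Ordered tie-breakers: (step name, key where smaller is better).
STEPS = [
    ("LOCAL_PREF", lambda r: -r.local_pref),  # higher LOCAL_PREF wins
    ("AS path length", lambda r: r.as_path_len),
    ("MED", lambda r: r.med),
    ("hot potato (IGP cost)", lambda r: r.igp_cost),
]


def decide(routes):
    """Return (best route, name of the step that reduced the choice to one)."""
    candidates = list(routes)
    if len(candidates) == 1:
        return candidates[0], "only route"
    for name, key in STEPS:
        best = min(key(r) for r in candidates)
        candidates = [r for r in candidates if key(r) == best]
        if len(candidates) == 1:
            return candidates[0], name
    # Still tied after all steps: pick deterministically by next hop.
    return min(candidates, key=lambda r: r.next_hop), "final tie-break"


routes = [
    Route("r1", 100, 3, 0, 5),
    Route("r2", 100, 3, 0, 2),
    Route("r3", 100, 4, 0, 1),
]
winner, step = decide(routes)
print(winner.next_hop, step)  # r2 hot potato (IGP cost)
```

Tallying the returned step name over many selections would give a per-step breakdown comparable to the percentages quoted above.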
Q: Can we use this approach to provide network admins with guidance for network configurations?
Link status varied with uniform random probability over simulation runtime
Link weights varied with uniform random probability over simulation runtime
Response: BO + OB, Global Perspective, and Default network settings
By maximizing link failure detection times, UPDATEs were most effectively minimized
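One intuition for this result can be sketched with a toy model. It assumes outages start at times drawn uniformly over the simulation runtime, last for a uniformly drawn duration, and trigger routing updates only if they last at least as long as the failure detection time. All three assumptions, along with the function names and numeric values, are illustrative and are not taken from the experiments.

```python
import random


def random_outages(n, runtime, max_len, rng):
    """Draw n outages as (start time, duration), both uniform."""
    return [(rng.uniform(0, runtime), rng.uniform(0, max_len)) for _ in range(n)]


def detected(outages, detection_time):
    """Count outages that last at least as long as the detection time."""
    return sum(1 for _, length in outages if length >= detection_time)


rng = random.Random(1)
outages = random_outages(n=1000, runtime=3600.0, max_len=60.0, rng=rng)

# Longer detection times let more short outages pass unnoticed.
for detection_time in (1.0, 10.0, 40.0):
    print(detection_time, detected(outages, detection_time))
```

Under this model the number of detected outages can only fall as the detection time grows, so fewer failures reach the routing protocols and fewer UPDATEs follow.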
Conclusions
– The number of experiments was reduced by many orders of magnitude in comparison to Full Factorial
– Experiment design and statistical analysis enabled rapid elimination of insignificant parameters
– Several qualitative statements and system characterizations could be obtained with few experiments.
– Provided validation of network measurement community results, and called into question importance of premises
– Search algorithms do not always find desired behaviour
! Allowed me to complete my thesis and graduate!