August 17, 2006, Innovations in Measurement Science
Measurement Science for Complex Information Systems
K. Mills, C. Dabrowski, V. Marbukh, F. Hunt and J. Filliben
Internet map from the OPTE project - http://www.opte.org/maps/
What are complex systems?
Large collections of interconnected components whose interactions lead to macroscopic behaviors
– Physical systems (e.g., earthquakes, avalanches, forest fires)
  Images: http://autoinfo.smartlink.net/quake/quake.htm, http://www.sover.net/~kenandeb/fire/hotshot.html, http://www.avalanche.org/
– Biological systems (e.g., slime molds, ant colonies, embryos)
  Images: © http://www.nationalgeographic.com/, © http://emergent.brynmawr.edu 2003
– Social systems (e.g., transportation networks, cities, economies)
  Images: http://www.wtopnews.com/, http://www.english.uiuc.edu/maps/depression/photoessay.htm, © http://www.waag.org/realtime/
– Information systems (e.g., Internet and Web services)
  Images: http://www.emulab.net/pix/pc3k-back.jpg, http://www.kk.org/thetechnium/images/aol-server-farm.jpg
What is the problem?
No one understands how to measure, predict or control macroscopic behavior in complex information systems
“[Despite] society’s profound dependence on networks, fundamental knowledge about them is primitive. [G]lobal communication … networks have quite advanced technological implementations but their behavior under stress still cannot be predicted reliably.… There is no science today that offers the fundamental knowledge necessary to design large complex networks [so] that their behaviors can be predicted prior to building them.”
— Network Science 2006, recently released NRC report
– threatening our nation’s security
– costing billions of dollars
How is it solved today?
• Network Science (little work)
– Analyzing Spatiotemporal Properties
– Predicting Phase Transitions
– Visualizing Macroscopic Evolution
– Controlling Global Behavior
• Network Test Facilities (very active)
– Emulab
– DETER
– GENI
– TeraGrid
– National Lambda Rail
• Network Technology (very active)
– Autonomic Computing
– IPv6 Transition
– Service-Oriented Architectures
– Mobile and Wireless Devices
– Peer-to-Peer Services
• Network Measurement (some work)
– Visualizing Topologies
– Analyzing Self-Similarity
– Archiving Traffic Samples
– Estimating Network Conditions
What is the new idea?
Leverage models and mathematics from the physical sciences to define a systematic method to measure, understand, predict and control macroscopic behavior in the Internet and distributed software systems built on the Internet
Technical Approach
• Establish models and analysis methods
– Computationally tractable
– Reveal macroscopic behavior
– Establish causality
• Characterize distributed control techniques
– Economic mechanisms to elicit desired behaviors
– Biological mechanisms to organize components
[Figure: the Internet and a distributed grid system. Labeled parts: client applications with tasks, client negotiators, and discovery, negotiation, task and execution control; grid processors with service negotiators, agreements, schedulers, supervisory processes and DRMS front-ends; DSF, DSI, GIIS and GRIS. Labeled relations: spawns, negotiates, monitors, requests reservation]
Why is this hard?
Valid, computationally tractable models that exhibit macroscopic behavior and reveal causality are difficult to devise
[Figure: analogy between physical systems (molecules, atmosphere, wind) and information systems (packets, Internet, congestion)]
Why is this hard?
Phase transitions are difficult to predict and control
[Figure labels: unordered equilibrium (self-organized criticality); oscillation; chaos; turbulence]
Physical world: high winds → hurricane
Information system: high load → congestion collapse
Who would care?
All designers and users of networks and distributed systems with a 25-year history of unexpected failures
– ARPAnet congestion collapse of 1980
– Internet congestion collapse of Oct 1986
– Cascading failure of AT&T long-distance network in Jan 1990
– Collapse of AT&T frame-relay network in April 1998 …
[Figure panels: self-similarities (Internet throughput); unexpected behaviors (grid job completions, probability vs. time); metastabilities (distribution of call types in wireless cells); phase transitions (synchronization among Internet routers)]
Who would care?
• “Cost of eBay's 22-Hour Outage Put At $2 Million”, Ecommerce, Jun 1999
• “Last Week’s Internet Outages Cost $1.2 Billion”, Dave Murphy, Yankee Group, Feb 2000
• “Microsoft scrambled to find … cause of … extensive outage that blocked traffic to … major Web sites”, Rachel Konrad, CNET News, Jan 2001
• “…the Internet "basically collapsed" Monday”, Samuel Kessler, Symantec, Oct 2003
• “widespread Internet outage hit customers of Salesforce.com … making it impossible … to access critical data”, Bill Snyder, Dec 2005
• “Network crashes … cost medium-sized businesses a full 1% of annual revenues”, Technology News, Mar 2006
• “costs to the U.S. economy … range … from $65.6 M for a 10-day [Internet] outage at an automobile parts plant to $404.76 M for … failure … at an oil refinery”, Dartmouth study, Jun 2006
Who would care tomorrow?
Designers and users of tomorrow's information systems that will adopt dynamic adaptation as a design principle
– Service-Oriented Architectures (DISA*)
– Autonomic Computing Systems (IBM)
– Sensor and Mobile Ad-Hoc Networks (DOD)
– Distributed Robot Teams (DHS)
*Net-Centric Enterprise Services initiatives in DOD ($13 B next 5 yrs)
[Figure: sensor network in which a requesting user's laptop/PDA reaches multifunctional sensor nodes through a sensor net gateway]
Who would care tomorrow?
• Market derived from Web services to reach $34 billion by 2010 (IDC)
• Grid computing market to exceed $12 billion in revenue by 2007 (IDC)
• Market for wireless sensor networks to reach $5.3 billion in 2010 (ON World)
• Revenue in mobile networks market will grow to $28 billion in 2011 (Global Information, Inc.)
• Market for service robots to reach $24 billion by 2010 (International Federation of Robotics)
NIST Groundwork
Preliminary investigation to identify hard technical issues
– Complex Dynamics in Communications Networks, December 2005 (including Macroscopic Dynamics in Large-Scale Data Networks by Yuan and Mills)
– Yuan and Mills, A Cross-Correlation-based Method for Spatial-Temporal Traffic Analysis, July 2005
– Yuan and Mills, Monitoring the Macroscopic Effects of Distributed Denial of Service (DDoS) Flooding Attacks, October 2005
– Yuan and Mills, Simulating Timescale Dynamics of Network Traffic Using Homogeneous Modeling, May-June 2006
Why is this hard? Why can we succeed?
Hard Issues                  Plausible Approaches
H1. Model scale              A1. Abstract models
H2. Model validation         A2. Key comparisons
H3. Tractable analysis       A3. Homogeneous models
H4. Causal analysis          A4. High-dimension techniques
H5. Controlling behavior     A5. Distributed control regimes
H1. Spatiotemporal Scale
• Systems of interest (e.g., Internet and compute grids) extend over large spatiotemporal extent
– Global reach, millions of components, interacting through many adaptive mechanisms over various timescales
• Which computational models can achieve sufficient spatiotemporal scaling properties?
– Micro-scale models are not computable at large spatiotemporal scale
– Macro-scale models are computable and might exhibit global behavior, but can they reveal causality?
– Meso-scale models might exhibit global behavior and reveal causality, but are they computable?
A1. Abstract Models
• Investigate abstract models from the physical sciences
– Fluid flows (from hydrodynamics)
– Lattice automata (from gas chemistry)
– Boolean networks (from biology)
– Agent automata (from geography)
• Apply parallel computing to scale to millions of components and days of simulated time
– PVODE (LLNL)
– CARPET/CAMEL (ISI-CNR, Italy)
– JavaParty (Karlsruhe, Germany)
– Scalable Simulation Framework (Dartmouth)
• Compute cycles from ITL/PL computing cluster
G ~ (K; S, TS; L, ML; R, NR)
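To make the lattice-automaton idea concrete, here is a minimal sketch, assuming a one-dimensional ring of cells updated with elementary rule 184, a standard textbook traffic automaton. It is an illustration only and is not one of the models named on this slide; the function names, ring size and step counts are hypothetical. Each cell holds at most one packet, and a packet advances only when the next cell is empty, so throughput rises with load up to a critical density and falls beyond it.

```python
import random

def step(cells):
    """One synchronous rule-184 update on a ring: a packet (1) moves one
    cell to the right only if that cell is currently empty (0)."""
    n = len(cells)
    new = [0] * n
    for i in range(n):
        left, here, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        if here == 1 and right == 1:
            new[i] = 1          # blocked packet stays in place
        elif here == 0 and left == 1:
            new[i] = 1          # packet arrives from the left neighbor
    return new

def mean_flow(density, n=200, warmup=400, measure=200, seed=0):
    """Average fraction of cells whose packet moves per step, after warmup."""
    rng = random.Random(seed)
    k = round(density * n)
    cells = [1] * k + [0] * (n - k)
    rng.shuffle(cells)
    for _ in range(warmup):
        cells = step(cells)
    moved = 0
    for _ in range(measure):
        moved += sum(1 for i in range(n)
                     if cells[i] == 1 and cells[(i + 1) % n] == 0)
        cells = step(cells)
    return moved / (measure * n)

if __name__ == "__main__":
    for density in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(density, mean_flow(density))
```

The update is purely local, which is what makes this class of model a candidate for parallel execution at large scale; whether such an abstraction is faithful enough is the validation question raised under H2.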
H2. Model Validation
• Scalable models from the physical sciences tend to be highly abstract
– e.g., differential equations, cellular automata, nk-Boolean nets
• Can sufficient fidelity be obtained to convince domain experts of the value of insights gained from such abstract models?
“Simulation carries with it the risk of using a model simplified to the point where key facets of Internet behavior have been lost, in which case any ensuing results could be useless (though they may not appear to be so).” Floyd and Paxson
A2. Key Comparisons
• Data Comparisons
– Existing traffic and analysis
• Model Comparisons
– Subset macro/meso-scale models and compare against micro-scale models
• Real Comparisons
– Simulations of distributed control regimes compared against implementations in test facilities
[Figure labels: GENI; ns-2]
H3. Tractable Analysis
• Scale of potential measurement data expected to be very large
– O(10^15)
– Millions of elements, tens of variables, millions of seconds
• How can measurement data be analyzed tractably?
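The scale estimate can be checked with simple arithmetic. The sketch below assumes one sample per variable per second and uses illustrative counts chosen only to match the slide's wording ("millions", "tens", "millions"); none of the three numbers comes from the source.

```python
# Back-of-the-envelope estimate of raw measurement volume.
# All three counts are assumptions for illustration, not source figures.
elements = 5_000_000   # assumed: millions of network elements
variables = 50         # assumed: tens of variables per element
seconds = 4_000_000    # assumed: millions of seconds (roughly 46 days)

# One sample per variable per element per second.
samples = elements * variables * seconds

# Order of magnitude computed from the integer's digit count (exact).
order = len(str(samples)) - 1

if __name__ == "__main__":
    print(f"{samples:.1e} samples, order of magnitude 10^{order}")
```

Under these assumptions the product is on the order of 10^15 samples, which is why exhaustive collection and analysis is treated here as intractable.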
A3. Homogeneous Models
• Homogeneous models allow one (or a few) elements to be sampled as representative of all (reducing data volume to 10^6 – 10^7)
• Amenable to statistical analyses, e.g.,
– Power-spectral density
– Wavelets
– Entropy
– Kolmogorov complexity
• Amenable to visualization
[Figure: surface plot of Price vs. Time (1000 s intervals), threshold -0.5; axes: Time, DSF, Price (0 to 0.02)]
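Two of the statistical analyses listed above can be sketched on a single sampled time series, which is the situation a homogeneous model creates. This is a minimal, standard-library illustration under stated assumptions: the function names, the equal-width binning and the direct O(n^2) discrete Fourier transform are choices made here for brevity, not methods described in the source.

```python
import cmath
import math
from collections import Counter

def shannon_entropy(series, bins=8):
    """Shannon entropy in bits of the series after equal-width binning."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 for a constant series
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in series)
    n = len(series)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def periodogram(series):
    """Power at each non-negative frequency bin, via a direct DFT of the
    mean-removed series (a simple power-spectral-density estimate)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power.append(abs(coeff) ** 2 / n)
    return power

if __name__ == "__main__":
    # Hypothetical sample: a pure oscillation with 5 cycles over 64 samples.
    sample = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
    spectrum = periodogram(sample)
    print("dominant frequency bin:", spectrum.index(max(spectrum)))
    print("entropy (bits):", shannon_entropy(sample))
```

A production analysis would use an FFT and a windowed estimator; the direct transform is kept only so the sketch stays dependency-free.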
H4. Causal Analysis
• Tractable analysis strategies yield coarse data
– Limited granularity of timescales, variables and spatial extents
• Coarseness may reveal macroscopic behavior that is not explainable from the data
– e.g., an unexpected collapse in the probability density function of job completion times in a computing grid was unexplainable without more detailed data and analysis
[Figure: probability (0.00 to 0.30) vs. time]
A4. High-Dimension Techniques
• Multidimensional analysis
– represent system state as a multidimensional space and depict system dynamics through various projections (e.g., slicing, aggregation, scaling)
• State-space dynamics
– segment system dynamics into attractor-basin field and monitor trajectories
[Figure panels: slicing/scaling; attractor-basin field; slicing/aggregation]
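The three projections named above (slicing, aggregation, scaling) can be illustrated on a toy state space. In this sketch the record layout (time, element, variable, value), the element and variable names and the sample values are all hypothetical, introduced only to show what each projection does.

```python
from collections import defaultdict

# Each record is (time, element, variable, value); layout and values are
# hypothetical and stand in for measured system state.
records = [
    (0, "r1", "queue", 3.0), (0, "r2", "queue", 5.0),
    (1, "r1", "queue", 4.0), (1, "r2", "queue", 8.0),
    (0, "r1", "loss", 0.0),  (1, "r2", "loss", 0.1),
]

def slice_by(records, variable):
    """Slicing: fix one dimension (the variable) and keep the others."""
    return [(t, e, v) for t, e, var, v in records if var == variable]

def aggregate_over_elements(sliced):
    """Aggregation: collapse the element dimension to a mean per time step."""
    buckets = defaultdict(list)
    for t, _, v in sliced:
        buckets[t].append(v)
    return {t: sum(vs) / len(vs) for t, vs in sorted(buckets.items())}

def rescale(series):
    """Scaling: map values onto [0, 1] so variables can be compared."""
    lo, hi = min(series.values()), max(series.values())
    span = (hi - lo) or 1.0   # avoid division by zero for a flat series
    return {t: (v - lo) / span for t, v in series.items()}

if __name__ == "__main__":
    mean_queue = aggregate_over_elements(slice_by(records, "queue"))
    print(mean_queue)
    print(rescale(mean_queue))
```

Each projection discards information, which is the trade-off H4 points to: a coarse view may show that something happened without showing why.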
H5. Controlling Behavior
• Large distributed systems and networks cannot be subjected to centralized control regimes
– Too many elements, too many parameters, too much change, too many policies
• Can models and analysis methods be used to determine how well decentralized control regimes stimulate desirable system-wide behaviors?
[Figure labels: element topology; configuring one element]
A5. Distributed Control Regimes (1)
Use price feedback to modulate supply and demand for resources or services
– Auctions (e.g., Waldspurger, et al. – VMware)
– Present-Value Analysis (e.g., Irwin, et al. – Duke)
– Commodity Markets (e.g., Wolski, et al. – UCSB)
[Figure panels: Auction Price vs. Time; Market Price vs. Time (Wolski Figure 3, Estimated Smale’s Equilibrium Price, series: Disk and CPU); Present Value vs. Space and Time (Price vs. Time, 1000 s intervals, threshold -0.5)]
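The general idea of price feedback can be sketched with a simple iterative adjustment: raise the price when demand exceeds supply, lower it otherwise. This is a generic illustration and not the auction, present-value or commodity-market mechanisms cited above; the demand curve, the budget, the gain and the function names are all assumptions made for the example.

```python
def demand(price, budget=100.0):
    """Assumed demand curve: consumers spend a fixed budget, so the
    quantity demanded falls as the price rises."""
    return budget / price

def adjust_price(price, supply, gain=0.05, steps=200):
    """Repeatedly move the price in proportion to relative excess demand.
    Returns the full price history so convergence can be inspected."""
    history = [price]
    for _ in range(steps):
        excess = demand(price) - supply
        # Positive excess demand raises the price; a floor keeps it positive.
        price = max(0.01, price * (1 + gain * excess / supply))
        history.append(price)
    return history

if __name__ == "__main__":
    # Hypothetical resource pool of 20 units, starting from a low price.
    prices = adjust_price(price=1.0, supply=20.0)
    print("start:", prices[0], "end:", round(prices[-1], 4))
```

With this particular demand curve the market clears where budget / price equals supply, so the price settles near budget / supply. Each consumer reacts only to the posted price, which is what makes such schemes candidates for decentralized control.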
A5. Distributed Control Regimes (2)
Use biological processes to differentiate function based on environmental feedback
– Morphogen gradients, chemotaxis, local and lateral inhibition, polarity inversion, quorum sensing used for topology formation, leader election and robustness (Nagpal – Harvard)
– Energy exchange and reinforcement used to create and maintain population of services and composition relations among services (Suda – UC Irvine)
[Figure panels: Gradients; Locality; Polarity Inversion; Energy Exchange Equations]
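A morphogen-style gradient can be approximated on a network as a hop count that spreads outward from a source node, after which each node picks a role from its own gradient value. The sketch below is a simplified illustration under stated assumptions: it computes the gradient with a centralized breadth-first search standing in for local message passing, and the topology, role names and radius are hypothetical. It is not the cited researchers' implementation.

```python
from collections import deque

def hop_gradient(neighbors, source):
    """Breadth-first propagation: each reachable node learns its hop
    distance from the source, analogous to a morphogen concentration."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance

def differentiate(distance, radius):
    """Each node chooses a role using only its own gradient value."""
    return {node: ("core" if d <= radius else "edge")
            for node, d in distance.items()}

if __name__ == "__main__":
    # Hypothetical four-node line topology.
    topology = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    gradient = hop_gradient(topology, "a")
    print(gradient)
    print(differentiate(gradient, radius=1))
```

In a real deployment each node would forward the smallest hop count it has heard plus one, with no global view; the search here only reproduces the values that process would converge to on a static topology.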
Why NIST?
Scientifically challenging metrology problem
– Model millions of interacting Turing machines
– Understand and control macroscopic behaviors
– Predict and mitigate phase transitions
– Characterize validity of abstract models
– Devise tractable analysis methods
– Establish causality of macroscopic behaviors
With direct applications to the nation’s critical cyberinfrastructure
– Predict ramifications of changes in technology and usage
– Identify vulnerabilities to failures and attacks
– Improve predictability and robustness
Expanding NIST leadership and competence
– Leverage knowledge in mathematics, statistics, computer sciences and physical sciences
– Increase scientific strength within ITL
– Open new opportunities in the study of complex systems
What is the impact? Who cares?
Looming (billions of dollars) investment in components without requisite scientific knowledge to deploy them in a system providing predictable behavior
Growing interdependence between information infrastructure and society’s well-being and prosperity
25-year history of outages related to unexpected behaviors costing many billions of dollars
Increasing incidence of attacks on the nation’s cyberinfrastructure will have unforeseen results
Project Plan
Why can we succeed?
Kevin Mills [PhD] (Senior Research Scientist and Project Leader)
Christopher Dabrowski [MS] (Computer Scientist – Grid Computing and Economic Control Regimes)
Vladimir Marbukh [PhD] (Mathematician – Internet and Economic Control Regimes)
Fern Hunt [PhD] (Mathematician – Analysis Methods for Dynamic Systems)
James Filliben [PhD] (Statistician – Exploratory Data Analysis)
Post-Doc #1 (Mathematician - Internet Fluid Flow Models)
Daniel Genin [PhD] from Penn State has applied
Post-Doc #2 (Computer Scientist – Grid Computing and Biological Models)
Guest Scientist #1 (Computer Scientist – Grid Computing and Biological Models)
Great team!
Why can we succeed?
Leverage work from NIST physical scientists
Garnett Bryant [PhD] – PL (Physicist – Modeling Macro-scale Optical Properties Emerging from Nanostructures)
Jack Douglas [PhD] – MSEL (Physicist – Modeling Phase Transitions)
Anne Chaka [PhD] – CSTL (Chemist – Modeling and Design of Smart Materials)
Edward Garboczi [PhD] – BFRL (Physicist – Advanced Material Science Simulations and Visualizations)
Emil Simiu [PhD] – BFRL (Civil Engineer – Failure of Complex Structural Systems)
+++ others at NIST
Why can we succeed?
Committed External Collaborators
Stephen Bush [PhD] – GE Corporate Research & Development (Computer Scientist – Complex Systems)
Jian Yuan [PhD] – Associate Professor, Tsinghua University (Electrical Engineer – Internet and Complex Systems)
Key Contacts in Government and Industry
Networking and Information Technology Research & Development Agencies – NSF, DOE, DOD, NASA, NOAA, NIH, NIST
Internet Engineering Task Force, World Wide Web Consortium and Open Grid Forum
How will the program be organized?
[Chart: tasks scheduled across FY 07, FY 08, FY 09, FY 10 and FY 11]
– Develop Internet & Web Services Models
– Develop Data Repository
– Validate Internet Models
– Devise and Test Analysis & Visualization Methods for Macroscopic Behavior
– Validate Web Services Models
– Document Modeling & Analysis Framework
– Devise & Test Causality Models
– Validate Internet Control Models
– Validate Web Services Control Models
– Augment Internet Models with Control Regimes
– Augment Web Services Models with Control Regimes
– Write & Publish NIST SP on Measurement Approach for Complex Information Systems
Heilmeier Summary
• What is the problem?
– Lack of scientific understanding necessary to measure, predict, and control global behaviors in computer networks and distributed systems
• Why is this hard?
– System complexity and model computability and validity + data volume and granularity
• How is this solved today?
– No solutions exist today
• What is the new idea?
– Leverage models and analysis methods from the physical sciences
• Why can we succeed?
– We have the people + models and analysis methods exist + public-domain parallel software packages available + compute cluster available
• Who would care?
– All designers and users of distributed computer systems and networks
• Why should NIST do this?
– No science exists to measure and control complex information systems
– National cyberinfrastructure is a critical resource
– NIST has the people and resources to meet this responsibility
Context
State-of-the-Art: Technology
• Looming Internet transition
– Internet Protocol Version 6
• Moving toward dynamic distributed systems
– Service-Oriented Architectures
– Grid Computing
– Autonomic Computing
• Changing usage patterns
– Mobile and wireless access
– Multimedia
– Peer-to-Peer
Underlying components and usage patterns in constant flux
State-of-the-Art: Measurement
• Visualizing Network Topologies
– Cooperative Association for Internet Data Analysis (UCSD)
– Network Tomography (many universities)
• Sampling, Archiving and Analyzing Traffic
– The Internet Traffic Archive (LBNL)
– National Laboratory for Applied Network Research (UCSD)
– Center for Internet Research (UCB)
– PREDICT Program (DHS)
• Estimating Current Network Conditions
– Network Weather Service (UCSB)
– The Internet Traffic Report (Opnix)
Lack of knowledge regarding what to measure and why
State-of-the-Art: Test Beds
• Medium-scale virtual test beds
– Emulab (and derivatives)
– DETER (funded by NSF and DHS)
– PlanetLab Consortium
• Optical network test beds
– National Lambda Rail
– National Transparent Optical Network
• Grid test beds
– TeraGrid, Open Science Grid and ATLAS
• Large-scale virtual test bed
– Global Environment for Network Innovations (GENI, a pending NSF initiative that is already soliciting multi-agency participation)
Tension between controlled test environments and real world
State-of-the-Art: Science
• Investigating Spatial Structure
– Scale-free topologies (Barabasi vs. Doyle)
• Investigating Temporal Structure
– Long-range dependence (Willinger, Feldman, Faloutsos)
• Investigating Spatiotemporal Structure
– Fluid flow models (Towsley and Liu)
– Cellular automata models (Csabai, Yuan, Ohira, Sole)
• Primitive state of understanding
– Models limited (Internet protocols only) and not validated
– Controversies about causalities: application traffic, adaptive protocols, network topology, coupled interactions
– No method to study decentralized control techniques
Primitive state of knowledge and understanding