A Program of Work for Understanding Emergent Behavior in Global Grid Systems
Chris Dabrowski & Kevin Mills
National Institute of Standards and Technology
February 13, 2006
March/2006 2
Outline
• What are emergent behaviors?
• Why are emergent behaviors likely in global grids?
• Can emergent behaviors be elicited or controlled?
• How are NIST researchers investigating these questions?
• Case study: denial-of-service (DoS) attack on simulated grid
What are emergent behaviors?
Emergent behaviors are coherent system-wide properties that cannot be deduced directly from analyzing the behavior of individual components
Emergent behaviors typically arise in dynamic open complex adaptive systems, where system-wide behavior derives from self-organizing interactions among myriad components
Some Dynamic Open Complex Adaptive Systems
[Figure: image collage, not reproduced. Image sources and credits:]
http://www.sover.net/~kenandeb/fire/hotshot.html
http://autoinfo.smartlink.net/quake/quake.htm
http://www.wtopnews.com/
http://www.avalanche.org/
http://www.ics.uci.edu/relations/develop/rs2001/teitelbaum/sld012.htm
http://www.english.uiuc.edu/maps/depression/photoessay.htm
© M.F. Schatz and J.L. Rogers 1998
© http://www.nationalgeographic.com/
© http://www.waag.org/realtime/
© http://emergent.brynmawr.edu 2003
How might a complex system be detected?
• Fractal patterns (image: http://www.mbfractals.com/usergal/dougowen.html)
• Self-similarity (image: http://www.physionet.org/tutorials/fmnc/node3.html; plot annotations: ON = Pareto, OFF = exponential; linear wavelets)
• 1/f noise (image © J. Davidsen and H.G. Shuster 2000)
• Power laws (images: http://complexity.orcon.net.nz/powerlaw.html and http://heseweb.nrl.navy.mil/gamma/solarflare/24mar00.htm)
Other ideas include a decrease in entropy or changes in statistical complexity
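One of the signatures listed on this slide, a power law, can be checked numerically. The sketch below is an illustration added for this write-up, not a method taken from the talk: it draws synthetic Pareto-distributed samples and recovers the tail index with the standard maximum-likelihood estimator. The function name `tail_index_mle`, the sample size, and the seed are arbitrary assumptions.

```python
import math
import random


def tail_index_mle(samples, x_min):
    """Maximum-likelihood estimate of the Pareto tail index alpha,
    assuming P(X > x) = (x_min / x) ** alpha for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail samples equal x_min; alpha is undefined")
    return len(tail) / log_sum


# Synthetic data only: a seeded generator keeps the run repeatable.
rng = random.Random(42)
true_alpha = 1.5
samples = [rng.paretovariate(true_alpha) for _ in range(20000)]

estimate = tail_index_mle(samples, x_min=1.0)
print(f"true alpha = {true_alpha}, estimated alpha = {estimate:.2f}")
```

An estimate that stays stable as `x_min` is raised is weak evidence of a power-law tail; a single fit on its own does not establish one.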
What characteristics might lead to a complex system?
System Scale – order emerges from many interactions over space and time
Communications Locality – inability to know global state
Element Simplicity – inability to process all possible states
Feedback – elements can sense environment and estimate global state
Element Autonomy – each element can vary its behavior based on feedback
Why are emergent behaviors likely in global grids?
Scale: large number of clients and services interacting via indirect coupling arising through use of shared resources
Communications Locality: clients cannot obtain complete and timely state of all resources – decisions must be made on partial information
Element Simplicity: clients possess limited processing power – decisions must be made with heuristics
Feedback: clients learn fate of resource requests and adapt subsequent requests based on updated information
Element Autonomy: clients decide how to proceed with no central control or direction
Can emergent behaviors be elicited or controlled?
Remains an open research question, for example:
– NASA exploring emergent programming to increase adaptability and survivability of future spacecraft (see Kenneth N. Lodding, “Hitchhikers Guide to Biomorphic Software”, ACM Queue vol. 2, no. 4)
– MIT exploring amorphous computing where systems structure and specialize themselves from a common set of components (http://www.swiss.csail.mit.edu/projects/amorphous)
– Radhika Nagpal (Harvard) studying how to engineer and understand self-organizing systems (http://www.eecs.harvard.edu/~rad)
– Several researchers exploring application of economic mechanisms, such as markets, auctions, and present-value calculations, as means to elicit effective behavior in distributed systems
How are NIST researchers investigating these questions?
Goals
• Understand self-organizing properties in service-oriented architectures (SOA)
• Investigate mechanisms to shape emergent behavior in SOA
• Improve related consortia specifications w.r.t. robustness, reliability, performance
Technical Approach
• Apply modeling and analysis techniques from the physical sciences
• Exploit exploratory data analysis and visualization methods
• Investigate control techniques from biology and economics
Project Phases
• Micro-model: 10^3 to 10^4 elements based on selected industry specs
• Macro-model: agent-based model of 10^4 to 10^6 elements containing selected abstractions, validated against the micro-model
[Figure: Space-Time-State Evolution. Axis and legend labels: No. Get Template Requests; Time; DSF; NSS 90% Load.]
Architecture of Global Compute Grid
[Diagram not reproduced. Entity labels: CLIENT, GLOBAL DIRECTORY, INTERNET, PROVIDER SITE, CLUSTER, SERVER, SCHEDULER, JOB MANAGER, LOCAL DIRECTORY. Numbered interactions:]
1. PUBLISH PROVIDER DESCRIPTION
2. DISCOVER PROVIDERS
3. GET AVAILABILITY
4. NEGOTIATE USAGE
5. TRANSFER INPUT DATA
6. MONITOR EXECUTION
Micro-model conception
Layered Component Architecture
Network Layer: sites located in (x,y,z)-space used to compute distance in hops and simulate transmission delays; TCP-like simulated transport protocol; nodes model CPU delays, buffer & port capacity
Basic Web Services: WS-Addressing and Messaging
WSRF: WS-Resource Property, Lifetime, Notification, Topics, Service Group
Grid Services: MDS v4, WS-Agreement, and DRMAA
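The network layer above places sites in (x,y,z)-space and derives hop distances and transmission delays from those coordinates. The slides do not give the exact rule, so the sketch below is only one plausible reading: it assumes Euclidean distance, one hop per `hop_length` units, and a fixed `per_hop_delay`. All names and parameter values here are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A site placed at a point in (x, y, z)-space."""
    name: str
    x: float
    y: float
    z: float


def hops_between(a, b, hop_length=1.0):
    """Assumed rule: one hop per `hop_length` units of Euclidean distance,
    with at least one hop between two distinct locations."""
    distance = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
    if distance == 0:
        return 0
    return max(1, math.ceil(distance / hop_length))


def transmission_delay(a, b, per_hop_delay=0.01, hop_length=1.0):
    """Assumed rule: delay grows linearly with the hop count."""
    return hops_between(a, b, hop_length) * per_hop_delay


client = Site("client", 0.0, 0.0, 0.0)
provider = Site("provider", 3.0, 4.0, 0.0)
print(hops_between(client, provider))
print(transmission_delay(client, provider))
```

A real simulator would add queueing and node-level delays (CPU, buffer, and port capacity, as the slide notes) on top of this distance-based term.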
Major Grid Entities
Service Providers: negotiate, schedule, execute, and monitor client tasks on vector or cluster computers maintained at a related site
Clients: discover providers, rank discoveries by earliest availability, seek agreements, submit & monitor jobs
Client Grid Applications
Application types: workflows of n sequential tasks, each with parallelizable sub-computations – dependent tasks may not start until preceding task completes
Tasks: types defined by tuple (required code, task parallelism, compute cycles) and matched to processor component with suitable code and parallelism
Workload: represented as a percentage of system capacity – regulated by assignment of applications to clients
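The task tuple and the client's ranking rule described above can be sketched as a small data model. This is an illustrative sketch, not the simulator's code: the class and field names are hypothetical, and "suitable parallelism" is assumed to mean that the processor offers at least the parallelism the task requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Task type as the tuple (required code, task parallelism, compute cycles)."""
    required_code: str
    parallelism: int
    compute_cycles: int


@dataclass(frozen=True)
class Processor:
    """Hypothetical view of a provider's processor component."""
    provider: str
    installed_codes: frozenset
    max_parallelism: int
    earliest_available: float  # simulated seconds


def can_run(processor, task):
    """Assumed matching rule: code is installed and parallelism is sufficient."""
    return (task.required_code in processor.installed_codes
            and processor.max_parallelism >= task.parallelism)


def rank_candidates(processors, task):
    """Keep suitable processors and rank them by earliest availability."""
    suitable = [p for p in processors if can_run(p, task)]
    return sorted(suitable, key=lambda p: p.earliest_available)


task = Task(required_code="fft", parallelism=8, compute_cycles=10**9)
processors = [
    Processor("site-a", frozenset({"fft"}), 16, 500.0),
    Processor("site-b", frozenset({"fft", "blas"}), 8, 120.0),
    Processor("site-c", frozenset({"blas"}), 32, 10.0),  # lacks the required code
    Processor("site-d", frozenset({"fft"}), 4, 5.0),     # too little parallelism
]
ranked = rank_candidates(processors, task)
print([p.provider for p in ranked])
```

Because availability figures come from discovery, a client's ranking reflects only the partial and possibly stale information it holds when it decides.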
Schematic showing operation of simulated grid
[Diagram not reproduced. Site labels: Client Site, Provider Site. Component labels: Application (Task 1, Task 2, Task 3), Discovery Control, Negotiation Control, Task Control, Client Negotiator, Supervisory Process, DSF, Scheduler, Execution Control, Grid Processor, DSI, Service Negotiator, Agreement, DRMS Front-End, GIIS, GRIS. Relation labels: spawns, negotiates, monitors, requests reservation.]
Case Study: DoS Attack on Simulated Grid
• Deploy simulated topology: 200 nodes covering 30 provider sites and 12 clients, where each client uses one of two negotiation strategies
• Negotiation strategies: serial reservation requests (SRR) or concurrent reservation requests (CRR)
• Run baseline: 50% workload for 200,000 simulated seconds and measure the distribution of job completion times
• Repeat run: inject service-provider spoofing with probability 50%, which effectively reduces system capacity by half on average
• Repeat run: identical spoofing, but introduce a strategy to resist spoofing: identify spoofers and do not repeat interactions with them
Three Questions of Interest
1. Which negotiation strategy is more effective under normal conditions?
2. Does the outcome change under attack?
3. Does the outcome change when resisting attack?
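The two negotiation strategies and the resistance rule in this case study can be contrasted with a toy sketch. The slides do not define the strategies in detail, so the code assumes that SRR asks providers one at a time in ranked order and that CRR asks all of them in a single round and keeps the best-ranked grant. The `Provider` class, its `grants` flag, and the round counting are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    grants: bool  # hypothetical: would this provider grant the reservation?


def serial_reservation(ranked):
    """SRR (assumed): ask one provider per round, stop at the first grant."""
    rounds = 0
    for provider in ranked:
        rounds += 1
        if provider.grants:
            return provider, rounds
    return None, rounds


def concurrent_reservation(ranked):
    """CRR (assumed): ask every provider in one round, keep the best-ranked grant."""
    if not ranked:
        return None, 0
    for provider in ranked:
        if provider.grants:
            return provider, 1
    return None, 1


def drop_known_spoofers(ranked, known_spoofers):
    """Resistance rule from the slide: do not repeat interactions with spoofers."""
    return [p for p in ranked if p.name not in known_spoofers]


ranked = [Provider("p1", False), Provider("p2", False),
          Provider("p3", True), Provider("p4", True)]

srr_choice, srr_rounds = serial_reservation(ranked)
crr_choice, crr_rounds = concurrent_reservation(ranked)
print(srr_choice.name, srr_rounds)
print(crr_choice.name, crr_rounds)

# Suppose the client has identified p3 as a spoofer and now avoids it.
filtered = drop_known_spoofers(ranked, {"p3"})
print(serial_reservation(filtered)[0].name)
```

The sketch counts only negotiation rounds for a single client. It does not reproduce the system-wide interactions among many clients, which is where the study looks for emergent effects.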
Bottom Line
1. CRR performs slightly better than SRR under normal conditions
2. CRR performs significantly better than SRR under attack scenario
3. Surprise: both CRR and SRR perform worse when resisting attack, and the performance of CRR deteriorates more than that of SRR
The surprise arises because scheduling and execution of jobs in the global grid is an emergent behavior arising from a self-organizing property of distributed resource-management algorithms
Serial Reservation Requests (SRR) vs. Concurrent Reservation Requests (CRR) with No Spoofing
[Figure: probability (y-axis, 0.00 to 0.30) versus time (x-axis, 10000 to 200000 in steps of 10000, plus >200000). Series: Concurrent Reservation Requests; Serial Reservation Requests.]
Comparative distribution of application completion times for two negotiation strategies (over 200+ repetitions)
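A distribution of completion times like the one compared here can be tabulated from raw simulation output. The sketch below is a minimal illustration under stated assumptions: bins are 10,000 simulated seconds wide and labeled by their upper edge, with one overflow bin for times above 200,000, matching the axis labels of the figure. The sample times are invented for the example and are not results from the study.

```python
import math


def completion_time_distribution(times, bin_width=10_000, max_time=200_000):
    """Return the fraction of completion times in each bin.

    Bin i covers (i * bin_width, (i + 1) * bin_width]; the final entry
    collects every time above max_time.
    """
    if not times:
        raise ValueError("at least one completion time is required")
    n_bins = max_time // bin_width
    counts = [0] * (n_bins + 1)  # last slot is the overflow bin
    for t in times:
        if t < 0:
            raise ValueError("completion times must be non-negative")
        if t > max_time:
            counts[-1] += 1
        else:
            counts[max(0, math.ceil(t / bin_width) - 1)] += 1
    return [c / len(times) for c in counts]


# Invented sample values, for illustration only.
times = [8_000, 12_000, 15_000, 20_000, 95_000, 250_000, 199_999, 40_000]
probabilities = completion_time_distribution(times)

labels = [str((i + 1) * 10_000) for i in range(20)] + [">200000"]
for label, p in zip(labels, probabilities):
    if p > 0:
        print(label, p)
```

Running the same tabulation separately for SRR and CRR clients would give two series to compare, as in the figure.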
Performance Degradation caused by Spoofing in Grid where 50% clients use SRR and 50% use CRR
[Figure: three panels of probability (y-axis, 0.00 to 0.30) versus time: (a) No Spoofing, (b) Spoofing without Resistance, (c) Spoofing with Resistance (SURPRISE).]
Comparative distribution of application completion times: (a) No Spoofing, (b) Spoofing without Resistance, and (c) Spoofing with Resistance (200+ repetitions)
Decomposing performance degradation caused by spoofing
[Figure: probability (y-axis, 0.00 to 0.30) versus time (x-axis, 10000 to 200000 in steps of 10000, plus >200000). Series: Spoofing without Resistance for CRR; Spoofing without Resistance for SRR; Spoofing with Resistance for CRR; Spoofing with Resistance for SRR.]
Aggregate Reservations Created over Time under Spoofing with and without Resistance
[Figure: amplitude (y-axis, 0 to 300) versus time (x-axis, 1 to 199). Series: (a) Without Resistance; (b) With Resistance.]
Two Time Series: (a) Reservations Created without Resistance and (b) Reservations Created with Resistance – 50% clients SRR and 50% CRR
Time Series for Application/Task Completions: Two Application Types without Resistance (lower blue) vs. with Resistance (upper red)
[Figure: four panels of amplitudes versus time, arranged by negotiation strategy (Serial Reservation Requests (SRR); Concurrent Reservation Requests (CRR)) and by application type (Application 1 with Task1 and Task2; Application 2 with Task1, Task2, and Task3). Annotations on the plots: Earlier, Later.]
Conclusions
Global Grids will be dynamic open complex adaptive systems with self-organizing properties leading to emergent behaviors
Changes made to behavior in individual components could have pervasive and unexpected effects on global behavior
We need to develop a science of complex information systems in order to predict and control macroscopic behavior