Cactus in GrADS (HFA)
Ian Foster
Dave Angulo, Matei Ripeanu, Michael Russell
Feb 02, 2016
Presentation Outline
• Introduction to Cactus
• Cactus Applications
• Cactus Architecture
• Cactus Worm Thorn
• Tequila Thorn (Cactus in GrADS)
• Tequila Architecture
• Issues
What is Cactus?
Cactus is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multidimensional simulations
Example Cactus Output
Example output from Numerical Relativity Simulations
Cactus Applications
Application Thorns are Astrophysics applications
Calculate Schwarzschild event horizons for colliding black holes
Cactus Applications (cont.)
Candidate apps are Elliptical Solver or BenchADM
Abstract topologies are simple 3D grids
Cactus Applications (cont.)
Applications can easily be “linked” in with the other thorns used as tools.
Application Thorns are just selected and run with the other selected thorns
Cactus Architecture
[Diagram: Cactus architecture. Labels: Cactus (Flesh, Thorns); Configure, CST, Make; Computational Toolkit and other toolkits; Operating Systems: AIX, NT, Linux, Unicos, Solaris, HP-UX, SuperUX, Irix, OSF]
Cactus Model (cont.)
Building an executable
Cactus Source
• Flesh
• Thorns: IOBasic, IOASCII, WaveToy, LDAP, Worm, …
Configuration
• Compiler options
• Tool options
• MPI options
• HDF5 options
Running Cactus
Parameter File
• Specify which thorns to activate
• Specify global parameters
• Specify restricted parameters
• Specify private parameters
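As a rough illustration of what a parameter file carries, the sketch below parses a simplified, assumed syntax: an `ActiveThorns` line plus `owner::name = value` assignments. The thorn and parameter names in the sample text are hypothetical, and real Cactus parameter files may differ in detail.

```python
def parse_parameter_file(text):
    """Parse a simplified parameter file into (active_thorns, parameters)."""
    active_thorns = []
    parameters = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank space
        if not line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"')
        if key.lower() == "activethorns":
            # Which thorns to activate for this run
            active_thorns.extend(value.split())
        else:
            # Qualified parameter: owner::name (owner is a thorn or the flesh)
            owner, _, name = key.partition("::")
            parameters.setdefault(owner, {})[name] = value
    return active_thorns, parameters


# Hypothetical parameter file contents (names are illustrative only)
EXAMPLE = '''
ActiveThorns = "IOBasic IOASCII WaveToy"
cactus::iterations = 100
wavetoy::initial_data = "gaussian"
'''

thorns, params = parse_parameter_file(EXAMPLE)
```

Here `thorns` lists the thorns to activate and `params` groups the remaining settings by the owner named before `::`.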
Cactus model
This is the currently working Cactus application framework that we will modify
Cactus “flesh” internals
Cactus Application Thorn(s)
Other Cactus Thorn(s)
Cactus “Worm” Thorn
Worm Thorn Functions
Initiates moving to new resource when scheduled time is exhausted
Contacts IS to get a new node to run on
Checkpoints application
Restarts application on the new node
Runs on single node only
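The Worm behaviour listed above can be sketched as a toy loop: run until the time slot is used up, ask the IS for a new node, checkpoint, and restart there. The `InformationService` class, the node names, and the step counts are all hypothetical stand-ins, not the Worm thorn's actual interfaces.

```python
class InformationService:
    """Hypothetical stand-in for the IS: hands out free node names."""

    def __init__(self, nodes):
        self._free = list(nodes)

    def get_new_node(self):
        return self._free.pop(0)


def run_worm(total_steps, steps_per_slot, info_service):
    """Run a toy simulation, migrating whenever the time slot is exhausted."""
    node = info_service.get_new_node()
    state = {"step": 0}
    history = []
    while state["step"] < total_steps:
        slot_end = min(state["step"] + steps_per_slot, total_steps)
        while state["step"] < slot_end:
            state["step"] += 1                   # one simulation timestep
        history.append((node, state["step"]))
        if state["step"] < total_steps:
            checkpoint = dict(state)             # checkpoint the application
            node = info_service.get_new_node()   # contact IS for a new node
            state = dict(checkpoint)             # restart on the new node
    return history


history = run_worm(10, 4, InformationService(["nodeA", "nodeB", "nodeC"]))
```

With these inputs the run visits three nodes in turn, each resuming from the step where the previous one stopped.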
GrADS Cactus Model
We will start from the “Worm” thorn code to make the new “Tequila” thorn (an apotheosized Cactus Worm).
Cactus “flesh” internals
Cactus Application Thorn(s)
Other Cactus Thorn(s)
Cactus “Tequila” Thorn
Tequila thorn functions
Receives event (generated by user) to initiate adapting resources.
Contacts ResourceSelector to get new bag of resources
Checkpoints application
Restarts application on the new resources
Events
Events that cause the user to want to adapt resources:
– User changes parameters during runtime, requiring additional resources
  • Example: starting an analysis routine
  • Example: running an event horizon finder
– User specifies that performance is not meeting expectations
Future Events
Possible future plans for automatic resource adaptation:
– User changes parameters during runtime, requiring additional resources
– Contract violations fire similar events
  • we were wrong the first time
  • resources get overloaded
  • more (or fewer) (or different) processors appear
  • distribution changed
  • resolution changed
Tequila thorn contacts ResourceSelector
ResourceSelector must be set up as a service
Tequila thorn sends request for new bag of resources
ResourceSelector responds with the new bag
Request and Response
The Request to the ResourceSelector will be stored in the InformationService
Only the pointer to the data in the IS will be passed to the ResourceSelector
The Response from the ResourceSelector will also be stored in the IS
Only the pointer to the data in the IS will be passed back.
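A minimal sketch of this pointer-passing exchange, assuming the IS behaves like a key-value store. The class names, the `is://` handle format, and the request fields are hypothetical and only illustrate the idea that data lives in the IS while handles travel between the Tequila thorn and the ResourceSelector.

```python
import itertools


class InformationService:
    """Hypothetical IS: stores data and returns a pointer (handle) to it."""

    def __init__(self):
        self._store = {}
        self._ids = itertools.count(1)

    def put(self, data):
        handle = f"is://{next(self._ids)}"
        self._store[handle] = data
        return handle

    def get(self, handle):
        return self._store[handle]


class ResourceSelector:
    """Hypothetical selector: reads the request from the IS, writes the reply there."""

    def __init__(self, info_service, available):
        self._is = info_service
        self._available = available

    def select(self, request_handle):
        request = self._is.get(request_handle)            # fetch request via pointer
        bag = self._available[: request["processors"]]    # pick a bag of resources
        return self._is.put({"resources": bag})           # reply with a pointer only


def request_new_resources(info_service, selector, processors):
    """Tequila side: store the request, pass the pointer, follow the reply pointer."""
    request_handle = info_service.put({"processors": processors})
    response_handle = selector.select(request_handle)
    return info_service.get(response_handle)["resources"]


info = InformationService()
selector = ResourceSelector(info, ["n1", "n2", "n3", "n4"])
bag = request_new_resources(info, selector, 2)
```

Only the two handles cross between the thorn and the selector; both payloads stay in the IS, where they persist after the exchange.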
Tequila communication overview
[Diagram: Cactus Tequila Thorn, ResourceSelector, InformationService]
Cactus Architecture in GrADS
[Diagram: Cactus architecture in GrADS. Labels: Cactus (Flesh, Thorns); Configure, CST, Make; Computational Toolkit and other toolkits; GrADS communication library; Operating Systems: AIX, NT, Linux, Unicos, Solaris, HP-UX, SuperUX, Irix, OSF]
Open Issues
How does Contract Monitor fit into architecture?
How does PPS fit into architecture?
How do COP and Application Launcher fit into architecture (Cactus has its own launcher and compiles its own code)?
How does Pablo fit into architecture (which thorns are monitored; is flesh monitored)?
End of Presentation
Slides Explaining Communication Details ********************
Communication Details step 1
Event sent to Tequila thorn requesting restart
[Diagram: Cactus Tequila Thorn, ResourceSelector, InformationService]
Communication Details step 2
Tequila stores AART in IS
Communication Details step 3
Tequila sends request to ResourceSelector passing pointer to data in IS
Communication Details step 4
ResourceSelector retrieves AART from IS
Communication Details step 5
ResourceSelector stores bag of resources (in AART) in IS
Communication Details step 6
ResourceSelector responds to Tequila passing pointer to data in IS
Communication Details step 7
Tequila retrieves AART with new bag of resources from IS
Requirements
Using the IS for communication adds overhead.
Why do this?
GrADS requirement 1: do some things (e.g. compile) at one time and have the results stored in a persistent storage area. Pick these stored results up later and complete other phases.
Requirements (cont.)
GrADS requirement 2: Application people want to be able to allow users to manually interact in any of the "module interfaces." Tequila allows this to be done with a web client.
Slide Explaining Parallelism in Cactus ***************
Parallelism in Cactus
Cactus is designed around a distributed memory model.
Each thorn is passed a section of the global grid.
The actual parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost zone information - each thorn is presented with a standard interface, independent of the driver.
Standard driver distributed with Cactus (PUGH) is for a parallel unigrid and uses MPI for the communication layer
PUGH can do custom processor decomposition and static load balancing
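To make the idea of decomposing a grid and exchanging ghost zones concrete, here is a one-dimensional sketch. It is not PUGH's algorithm; the function name, the even-split rule, and the ghost width of 1 are assumptions chosen for illustration.

```python
def decompose(global_size, num_procs, ghost=1):
    """Split [0, global_size) into contiguous blocks, one per processor.

    Each entry records the index range a processor owns and the wider range
    it stores once ghost zones from its neighbours are included
    (all ranges are half-open: start inclusive, end exclusive).
    """
    base, extra = divmod(global_size, num_procs)
    pieces = []
    start = 0
    for rank in range(num_procs):
        size = base + (1 if rank < extra else 0)  # spread the remainder evenly
        end = start + size
        pieces.append({
            "rank": rank,
            "owned": (start, end),
            "with_ghosts": (max(0, start - ghost), min(global_size, end + ghost)),
        })
        start = end
    return pieces


pieces = decompose(10, 3)
```

A thorn would work only on its `owned` range, while the driver fills the extra `with_ghosts` cells from neighbouring processors before each step.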
Slide with Alternate Tequila Architecture ***************
Sample Tequila Scenario
User asks to run an ADM simulation 400x400x400 for 1000 timesteps in 10s.
Resource selector contacted to obtain virtual machines
Best virtual machine selected based on performance model.
AM starts Cactus on that virtual machine (and monitors execution; contracts?)
User (or application manager) decides that computation advances too slowly and decides to search for a better virtual machine
AM finds a better machine, commands the Cactus run to checkpoint, transfers files and restarts Cactus
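The selection step in this scenario can be sketched with a toy performance model. The model below assumes ideal scaling (run time equals total grid-point updates divided by aggregate update rate) and ignores communication cost; the virtual machine names and rates are made up for illustration.

```python
def predicted_seconds(grid_points, timesteps, vm):
    """Toy performance model: total updates / aggregate update rate."""
    return grid_points * timesteps / (vm["processors"] * vm["updates_per_second"])


def select_best(vms, grid_points, timesteps):
    """Pick the virtual machine with the lowest predicted run time."""
    return min(vms, key=lambda vm: predicted_seconds(grid_points, timesteps, vm))


# Hypothetical candidate virtual machines returned by a resource selector
vms = [
    {"name": "vm-a", "processors": 16, "updates_per_second": 2.0e7},
    {"name": "vm-b", "processors": 64, "updates_per_second": 1.0e7},
]

grid_points = 400 ** 3   # 400x400x400 grid from the scenario
timesteps = 1000

best = select_best(vms, grid_points, timesteps)
best_time = predicted_seconds(grid_points, timesteps, best)
```

The same comparison could be rerun during execution with measured rates in place of the assumed ones, which is one way an application manager might decide that a better virtual machine is worth a checkpoint and restart.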
Slides Explaining Different Tequila Architectures *********************
Tequila Architecture Choices
Main presentation explained the short term Tequila Architecture
Open issues covered not-yet-resolved architectural choices for longer term integration
Worm Spawning
[Diagram: two Cactus instances (Cactus Flesh with Worm Thorn), GIIS, Application Manager]
1. Resources obtained
2. Application Manager instructed to spawn new instance
3. New instance spawned
Tequila Spawning
The short-term plan is to simply replace the GIIS with the UCSD Resource Selector
Tequila would make the request for new resources to the RS instead of the GIIS
Tequila Spawning
[Diagram: two Cactus instances (Cactus Flesh with Tequila Thorn), UCSD Resource Selector, Application Manager]
1. Resources obtained
2. Application Manager instructed to spawn new instance
3. New instance spawned
Tequila Spawning
Longer term plan is not yet resolved.
One possibility is to put all GrADS pieces into the Application Manager
[Diagram: Application Manager, two Cactus instances (Cactus Flesh with Tequila Thorn), UCSD Resource Selector]
1. Application Manager instructed to spawn new instance
2. Resources obtained
3. New instance spawned