TeraGrid: A National Production Cyberinfrastructure Facility
Scott Lathrop
TeraGrid Director of Education, Outreach and Training
University of Chicago and Argonne National Laboratory
[email protected]
www.teragrid.org
June 26, 2006
TeraGrid: Integrating NSF Cyberinfrastructure
[Figure: map of TeraGrid sites: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC]
TeraGrid is a facility that integrates computational, information, and analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research.
[Figure: additional map labels: NCAR, Caltech, USC-ISI, Utah, Iowa, Cornell, Buffalo, UNC-RENCI, Wisc]
TeraGrid Vision
• TeraGrid will create integrated, persistent, and pioneering computational resources that will significantly improve our nation’s ability and capacity to gain new insights into our most challenging research questions and societal problems.
–Our vision requires an integrated approach to the scientific workflow including obtaining access, application development and execution, data analysis and management, and collaboration.
TeraGrid Objectives
• DEEP Science: Enabling Petascale Science
  – Make Science More Productive through an integrated set of very-high capability resources
    • Address key challenges prioritized by users
• WIDE Impact: Empowering Communities
  – Bring TeraGrid capabilities to the broad science community
    • Partner with science community leaders and educators
• OPEN Infrastructure, OPEN Partnership
  – Provide a coordinated, general purpose, reliable set of services and resources
    • Partner with campuses and grids
TeraGrid DEEP SCIENCE Objectives: Enabling Petascale Science
• Make Science More Productive through an integrated set of very-high capability resources
  – Address key challenges prioritized by users
• Ease of Use: TeraGrid User Portal
  – Significant and deep documentation and training improvements
  – Addresses user tasks related to allocations, accounts
• Breakthroughs: Advanced Support for TeraGrid Applications (ASTA)
  – Hands-on, “Embedded” consultant to help teams bridge a gap
    • Seven user teams have been helped
    • Eight user teams currently receiving assistance
    • Five proposed projects with new user teams
• New Capabilities driven by user surveys
  – WAN Parallel File System for remote I/O (move data only once!)
  – Enhanced workflow tools (added GridShell, VDS)
TeraGrid Resources
Over 100 TeraFlops in Computing Resources
Sites: ANL/UC, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC
Computational Resources:
Itanium2 (0.5 TF), IA-32 (0.5 TF), Itanium2 (0.2 TF), IA-32 (2.0 TF), Itanium2 (10.7 TF), SGI SMP (7.0 TF), Dell Xeon (17.2 TF), IBM p690 (2 TF), Condor Flock (1.1 TF), IA-32 (0.3 TF), XT3 (10 TF), TCS (6 TF), Marvel (0.3 TF), Hetero (1.7 TF), IA-32 (11 TF), Itanium2 (4.4 TF), Power4+ (15.6 TF), Blue Gene (5.7 TF), IA-32 (6.3 TF)
Online Storage: ANL/UC 20 TB, IU 32 TB, NCSA 1140 TB, ORNL 1 TB, PSC 300 TB, Purdue 26 TB, SDSC 1400 TB, TACC 50 TB
Mass Storage: 1.2 PB, 5 PB, 2.4 PB, 1.3 PB, 6 PB, 2 PB
Data Collections:
5 Col., >3.7 TB, URL/DB/GridFTP
>30 Col., URL/SRB/DB/GridFTP
4 Col., 100GB-6TB, SRB/Portal/OPeNDAP
>70 Col., >1 PB, GFS/SRB/DB/GridFTP
4 Col., 2.35 TB, SRB/Web Services/URL
Instruments: Proteomics; X-ray Cryst.; SNS and HFIR Facilities
Visualization Resources (RI: Remote Interact; RB: Remote Batch; RC: RI/Collab):
RI, RC, RB: IA-32, 96 GeForce 6600GT
RB: SGI Prism, 32 graphics pipes; IA-32
RI, RB: IA-32 + Quadro4 980 XGL
RB: IA-32, 48 Nodes
RB
RI, RC, RB: UltraSPARC IV, 512GB SMP, 16 gfx cards
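As a quick sanity check on the headline figure, the peak ratings in the Computational Resources listing can be summed directly. The sketch below is only illustrative arithmetic: the values are copied from the Computational Resources and Online Storage rows of this slide, the variable names are hypothetical, and no per-site attribution of the compute systems is assumed.

```python
# Peak ratings (TF) copied from the Computational Resources listing, in slide order.
peak_tf = [0.5, 0.5, 0.2, 2.0, 10.7, 7.0, 17.2, 2.0, 1.1, 0.3,
           10.0, 6.0, 0.3, 1.7, 11.0, 4.4, 15.6, 5.7, 6.3]

# Online storage (TB) copied from the Online Storage row, in slide order.
online_tb = [20, 32, 1140, 1, 300, 26, 1400, 50]

# Round to one decimal to hide floating-point noise in the sum.
total_tf = round(sum(peak_tf), 1)
total_online_tb = sum(online_tb)

print(f"Aggregate peak: {total_tf} TF")
print(f"Aggregate online storage: {total_online_tb} TB")
```

The listed systems add up to 102.5 TF, consistent with the "over 100 TeraFlops" headline, and the online storage row adds up to 2969 TB.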
TeraGrid Usage
33% Annual Growth
PACI Systems
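To give a feel for what a 33% annual growth rate implies, the sketch below compounds it over a few years and computes the doubling time. Only the 33% figure comes from the slide; the baseline of 1.0 is a normalized placeholder, not a measured usage number, and steady compound growth is an assumption.

```python
import math

annual_growth = 0.33  # growth rate quoted on the slide

# Years needed for usage to double under steady compound growth (assumed).
doubling_years = math.log(2) / math.log(1 + annual_growth)

# Relative usage after n years, normalized to 1.0 in year 0 (placeholder baseline).
relative_usage = [(1 + annual_growth) ** n for n in range(4)]

print(f"Doubling time: {doubling_years:.2f} years")
print([round(u, 2) for u in relative_usage])
```

Under that assumption, usage would double roughly every two and a half years.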
TeraGrid PIs by Institution as of May 2006
[Figure: map of TeraGrid PIs. Legend: Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI]
TeraGrid User Community
[Figure: bar chart by fiscal year: FY04, FY05, FY06 (8 mos); vertical axis 0 to 300]
The 160 DAC proposals in FY06 continue the strong growth in new users investigating the use of TeraGrid for their science.
The Development Allocations Committee (DAC) accepts requests to develop applications, experiment with TeraGrid platforms, or use TeraGrid systems for classroom instruction.
Ease of Use: TeraGrid User Portal
• Account Management
  – Manage my allocation(s)
  – Manage my credentials
  – Manage my project users
• Information Services
  – TeraGrid resources & attributes
  – job queues
  – load and status information
• Documentation
  – User Info documentation
  – contextual help for interfaces
• Consulting Services
  – help desk information
  – portal feedback channel
• Allocation Services
  – How to apply for allocations
  – Allocation request/renewal
Eric Roberts ([email protected])
Advanced Support for TeraGrid Applications
Project | PI (Inst) | NSF Div | Status | TeraGrid Staff
Arterial | Karniadakis (Brown U) | CTS | completed | O’Neal (GIG/PSC)
Vortonics | Boghosian (Tufts U) | CTS | completed | O’Neal (GIG/PSC)
SPICE | Coveney (UCL) | CHE | completed | O’Neal (GIG/PSC)
ENZO | Norman (UCSD) | AST | completed | Harkness (SDSC)
Injector | Heister (Purdue) | ASC | completed | Kim (NCSA)
MD-Data | Jakobsson (UIUC) | BIO | completed | Parker (NCSA)
NREL | Brady (Cornell) | BIO | completed | Chukkapali (SDSC)
SCEC | Olsen (SDSU) | GEO | in progress | Cui (SDSC), Reddy (GIG/PSC)
BIRN | Ellisman (SDSC) | BIO | in progress | Majumdar (SDSC)
CMS | Newman (Caltech) | PHY | in progress | Milfeld (TACC)
CIG | Gurnis (Caltech) | GEO | in progress | Gardner (PSC)
EarthScope | Pavlis (IU) | GEO | in progress | Sheppard (IU)
Crystal | Deem (Rice) | PHY | in progress | Walker (TACC)
Tstorms | Droegemeier (OU) | ATM | in progress | O’Neal (GIG/PSC)
Turbulence | Woodward (U Minn) | ASC | in progress | Reddy (PSC)
Nemo3D | Klimeck (Purdue) | ENG | proposed | Raymond (GIG/PSC)
Epidemiology | Barrett (VaTech), Cuticchia (Duke) | BCS | proposed | Marcusiu (NCSA)
Pulmonary Immunity | Benos (Pitt) | BIO | proposed | Raymond (GIG/PSC)
Demography | Lansing (U Arizona) | DBS | proposed | Majumdar (SDSC), Gome (PSC)
Multidimensional Microscope Imaging | Luby-Phelps (UT SW Med Ctr, Dallas) | BIO | proposed | Hempel (TACC)
Magnetic Nanocomposites
Wang (PSC)
• Direct quantum mechanical simulation on Cray XT3.
• Goal: nano-structured material with potential applications in high-density data storage: 1 particle/bit.
  – Need to understand the influence of these nanoparticles on each other.
• A petaflop machine would enable realistic simulations for nanostructures of ~50 nm (~5M atoms).
• LSMS (locally self-consistent multiple scattering) is a linear-scaling ab initio electronic structure method (Gordon Bell Prize winner).
• Achieves as high as 81% of peak performance on the Cray XT3.
Wang (PSC), Stocks, Rusanu, Nicholson, Eisenbach (ORNL), Faulkner (FAU)
VORTONICS
Boghosian (Tufts)
• Physical challenges: Reconnection and Dynamos
  – Vortical reconnection governs establishment of steady-state in Navier-Stokes turbulence
  – Magnetic reconnection governs heating of solar corona
  – The astrophysical dynamo problem. Exact mechanism and space/time scales unknown and represent important theoretical challenges
• Computational challenges: Enormous problem sizes, memory requirements, and long run times
  – Requires relaxation on space-time lattice of 5-15 Terabytes.
  – Requires geographically distributed domain decomposition (GD3): DTF, TCS, Lonestar
• Real time visualization at UC/ANL
  – Insley (UC/ANL), O’Neal (PSC), Guiang (TACC)
[Figure: Homogeneous turbulence driven by force of Arnold-Beltrami-Childress (ABC) form]
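The idea behind geographically distributed domain decomposition can be sketched as splitting one lattice into contiguous slabs, one per site, so that only slab boundaries have to cross the wide-area network. This is a minimal illustration and not the VORTONICS implementation: the lattice depth and the relative site weights are hypothetical values, and only the site names (DTF, TCS, Lonestar) come from the slide.

```python
def decompose(n_planes, weights):
    """Split n_planes lattice planes into contiguous slabs proportional to weights.

    Returns {site: (start, stop)} with half-open ranges that cover 0..n_planes.
    """
    total = sum(weights.values())
    slabs, start, acc = {}, 0, 0.0
    sites = list(weights)
    for i, site in enumerate(sites):
        acc += weights[site]
        # The last site takes whatever remains so every plane is assigned exactly once.
        stop = n_planes if i == len(sites) - 1 else round(n_planes * acc / total)
        slabs[site] = (start, stop)
        start = stop
    return slabs

# Hypothetical lattice depth and relative capacities; only the site names are from the slide.
slabs = decompose(1024, {"DTF": 2, "TCS": 1, "Lonestar": 1})
for site, (lo, hi) in slabs.items():
    print(site, lo, hi, "planes:", hi - lo)
```

With contiguous slabs, each site would only need to exchange its boundary planes with its neighbours at every step, which is what keeps wide-area traffic small relative to local computation.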
TeraShake / CyberShake
Olsen (SDSU), Okaya (USC)
• Largest and most detailed earthquake simulation of the southern San Andreas fault.
• First calculation of physics-based probabilistic hazard curves for Southern California using full waveform modeling rather than traditional attenuation relationships.
• Computation and data analysis at multiple TeraGrid sites.
• Workflow tools enable work at a scale previously unattainable by automating the very large number of programs and files that must be managed.
• TeraGrid staff Cui (SDSC), Reddy (GIG/PSC)
[Figure: Simulation of a magnitude 7.7 seismic wave propagation on the San Andreas Fault. 47 TB data set.]
[Figure: Major Earthquakes on the San Andreas Fault, 1680-present: 1906 (M 7.8), 1857 (M 7.8), 1680 (M 7.7)]
Searching for New Crystal Structures
Deem (Rice)
• Searching for new 3-D zeolite crystal structures in crystallographic space
• Requires 10,000s of serial jobs through TeraGrid.
• Using MyCluster/GridShell to aggregate all the computational capacity on the TeraGrid for accelerating search.
• TG staff Walker (TACC) and Cheeseman (Purdue)
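A search made of tens of thousands of independent serial jobs is a natural fit for a task-farm pattern, in which a pool of workers pulls jobs until none remain. The sketch below shows that pattern using only Python's standard library; it does not use or imitate the MyCluster/GridShell interfaces, and `score_candidate` is a hypothetical stand-in for one serial structure evaluation.

```python
from concurrent.futures import ThreadPoolExecutor

def score_candidate(seed):
    """Hypothetical stand-in for one independent serial job; returns (seed, score)."""
    return seed, (seed * 2654435761 + 12345) % 1000  # deterministic placeholder score

def run_search(n_jobs, n_workers):
    # Each job is independent, so workers simply pull the next seed when free.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(score_candidate, range(n_jobs)))
    # Keep the lowest-scoring candidate as the current best.
    return min(results, key=lambda r: r[1])

best_seed, best_score = run_search(n_jobs=10_000, n_workers=8)
print(best_seed, best_score)
```

Because the jobs share no state, the same pattern scales by adding workers, which is the property that makes aggregating capacity across many machines attractive for this kind of search.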
TeraGrid WIDE IMPACT Objectives: Empowering Communities
• Bring TeraGrid capabilities to the broad science community
  – Partner with science community leaders - “Science Gateways”
• Science Gateways Program
  – Originally ten partners, now 21 and growing
    • Reaching over 100 Gateway partner institutions (PIs)
    • Anticipating order of magnitude increase in users via Gateways
• Education, Outreach, and Training
  – National collaborations integrating TeraGrid resources
TeraGrid Science Gateways Initiative: Community Interface to Grids
• Common Web Portal or application interfaces (database access, computation, workflow, etc).
• “Back-End” use of TeraGrid computation, information management, visualization, or other services.
• Standard approaches so science gateways may readily access resources in any cooperating Grid without technical modification.
Science Gateway Partners
• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE, UChicago)
• National Virtual Observatory (NVO, Caltech)
• Linked Environments for Atmospheric Discovery (LEAD, Indiana)
• Computational Chemistry Grid (GridChem, NCSA)
• Computational Science and Engineering Online (CSE-Online, Utah)
• GEOsciences Network (GEON, SDSC)
• Network for Earthquake Engineering Simulation (NEES, SDSC)
• SCEC Earthworks Project (USC)
• Astrophysical Data Repository (Cornell)
• CCR ACDC Portal (Buffalo)
• Network for Computational Nanotechnology and nanoHUB (Purdue)
• GIScience Gateway (GISolve, Iowa)
• Biology and Biomedicine Science Gateway (UNC RENCI)
• Open Life Sciences Gateway (OLSG, UChicago)
• The Telescience Project (UCSD)
• Grid Analysis Environment (GAE, Caltech)
• Neutron Science Instrument Gateway (ORNL)
• TeraGrid Visualization Gateway (ANL)
• BIRN (UCSD)
• Gridblast Bioinformatics Gateway (NCSA)
• Earth Systems Grid (NCAR)
• SID Grid (UChicago)
TeraGrid Science Gateway Partner Sites
21 Science Gateway Partners (and growing) - Over 100 partner Institutions
TeraGrid Education, Outreach and Training
The mission is to engage larger and more diverse communities of researchers, educators and learners in discovering, using, and contributing to TeraGrid.
The goals are to:
– Enable awareness and broader community access to TeraGrid
– Promote diversity among all activities
– Foster partnerships to sustain and scale up best practices
K-12 Outreach
•Computational science workshops for K-12 teachers, and pre-service students
•GEMS to engage young girls in math and science
•Summer workshops for students - hands-on science
•SC05/06 Education Programs engage teachers and faculty
•SDSC TeacherTECH engages K-12 teachers and students with sustained interaction
•SDSC Data Portal incorporates scientific data into the curricula
Higher Education Outreach
• Summer Grid workshop for undergraduate students
• Computational science workshops - faculty and students
  – E.g. Computational chemistry, computational biology, etc.
• MSI workshop - campus infrastructure
• Developing Bioinformatics Programs
• SC07-09 Education Programs engage faculty and students
• Science Gateways in education
  – LEAD science gateway education portal
  – nanoHub science gateway used in many campus courses
• HPC workshops
• Research experiences for undergraduates (REUs)
• On-line tutorials - CI topics and college courses
• HASTAC workshop - HASTAC’s Information Year 2006-2007
Community Outreach
• CIP seminars, CI-Channel — live webcasts and recorded sessions
• National conferences — Grace Hopper TeraGrid panel, SCxx annual conferences, etc.
• Science Impact stories — via web site, press releases, brochures
• TeraGrid Speaker’s Bureau — conferences, workshops, meetings
• Katrina: After the Storm — Civic Engagement Through Arts, Humanities and Technology, part of HASTAC’s Information Year 2006-2007
• Annual TeraGrid Conference — planning underway for 2007 in Washington, DC.
SC07-09 Education Program Goals
• Three-year (SC|07-09) Education Program to provide continuity and broader, sustained impact in education
• Increase participation of larger, more diverse communities in the SC Conference
• Integrate HPC into undergraduate science, technology, engineering and mathematics classrooms
• Recruiting Institutions NOW!
SC07-09 Education Program Year-round Activities
• Attend annual SC Conference
• Week-long summer workshops distributed around the country
• Regular visits to institutions for workshops and working with administrators
• Mentoring of faculty and students
• Course materials development
• Posting of materials to ACM and NSF-NSDL digital libraries
• Committees to plan and organize events
TeraGrid OPEN Objectives: Infrastructure and Partnerships
• Provide a coordinated, general purpose, reliable set of services and resources
  – Partner with grids and facilities
• Streamlined Software Integration
  – Evolved architecture to leverage standards, web services
• Campus Partnership Programs
  – User access, physical and digital asset federation, outreach
TeraGrid “Open” Initiatives
• Working with Campuses: toward Integrated Cyberinfrastructure
  – Access for Users: Authentication and Authorization
  – Additional Capacity: Integrated resources
  – Additional Services: Integrated data collections
  – Broadening Participation: Affiliates program with diverse institutions
• Technology Foundations
  – Security, Authentication, and Accounting
  – Service-based Software Architecture
Lower Integration Barriers; Improved Scaling
• Initial Integration: Implementation-based
  – Coordinated TeraGrid Software and Services (CTSS)
    • Provide software for heterogeneous systems, leverage specific implementations to achieve interoperation.
    • Evolving understanding of “minimum” required software set for users
• Emerging Architecture: Services-based
  – Core services: capabilities that define a “TeraGrid Resource”
    • Authentication & Authorization Capability
    • Information Service
    • Auditing/Accounting/Usage Reporting Capability
    • Verification & Validation Mechanism
  – Significantly smaller than the current set of required components.
  – Provides a foundation for value-added services.
    • Each Resource Provider selects one or more added services, or “kits”
    • Core and individual kits can evolve incrementally, in parallel
Example Value-Added Service Kits
• Job Execution
• Application Development
• Science Gateway Hosting
• Application Hosting
  – dynamic service deployment
• Data Movement
• Data Management
• Science Workflow Support
• Visualization
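One way to picture the services-based architecture is as a small data model: a resource qualifies as a "TeraGrid Resource" once it offers every core capability, and it then advertises whichever value-added kits it chooses. The sketch below is an illustration under stated assumptions; the class, field, and identifier names are hypothetical, and only the capability and kit names are taken from the slides.

```python
from dataclasses import dataclass, field

# Core capabilities named on the slides as defining a "TeraGrid Resource".
CORE_SERVICES = frozenset({
    "authentication_authorization",
    "information_service",
    "auditing_accounting_usage_reporting",
    "verification_validation",
})

@dataclass
class Resource:
    name: str                                   # hypothetical identifier
    services: set = field(default_factory=set)  # core capabilities offered
    kits: set = field(default_factory=set)      # optional value-added kits

    def missing_core(self):
        """Core capabilities this resource still lacks."""
        return CORE_SERVICES - self.services

    def is_teragrid_resource(self):
        """True once every core capability is present, regardless of kits."""
        return not self.missing_core()

# Hypothetical resource providers; kit names follow the example list on the slide.
cluster = Resource("example-cluster", set(CORE_SERVICES), {"job_execution", "data_movement"})
partial = Resource("example-campus", {"information_service"})

print(cluster.is_teragrid_resource(), sorted(cluster.kits))
print(partial.is_teragrid_resource(), sorted(partial.missing_core()))
```

Keeping the core check independent of the kit set mirrors the point that the core and the individual kits can evolve incrementally and in parallel.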
Cyberinfrastructure User Advisory Committee
PK Yeung, Georgia Institute of Technology
Gerhard Klimeck, Purdue University
Thomas Cheatham, University of Utah
Gwen Jacobs, Montana State University
Luis Lehner, Louisiana State University
Philip Maechling, University of Southern California
Roy Pea, Stanford University
Alex Ramirez, Hispanic Association of Colleges and Universities
Nora Sabelli, Center for Innovative Learning Technologies
Patricia Teller, University of Texas - El Paso
Cathy Wu, Georgetown University
Bennett Bertenthal, University of Chicago
TeraGrid: More Information
The TeraGrid facility is funded by the Office of Cyberinfrastructure at the National Science Foundation.
www.nsf.gov
Charlie Catlett ([email protected])
[Figure: diagram labels: Data Collections, Instruments & Sensors, Colleagues, Science/Education Portal]