TeraGrid Science Support
Nancy Wilkins-Diehr, San Diego Supercomputer Center
Area Director for Science Gateways
Jan 14, 2006
Talk Outline
• User Support
  – User Engagements
  – Advanced Support for TeraGrid Applications (ASTA)
• Science Gateways
  – Initial projects
  – Deployment strategies
  – Preparation for expansion
•Education, Outreach and Training (EOT)
Questions answered in this presentation
• Has user engagement been effective?
  – How are user requirements investigated and defined?
  – How are uncertainty and change in user requirements managed?
  – How is usability evaluated, e.g., formatively and summatively?
  – How are applications prioritized for implementation?
  – What refinements or changes to 2.1-4 are envisaged?
• Has outreach been effective?
  – How is the potential for the wider take-up of applications assessed?
  – How are applications being adapted for use by wider user communities?
• Has training been effective?
  – How is effectiveness being assessed?
  – What quality control measures are in place for training materials?
  – What refinements or changes to 4.1-2 are envisaged?
• Science Gateways:
  – The TeraGrid report refers to a document, a Science Gateway primer, that reports on the general strategy for portal deployment. The reference given is http://wg.teragrid.org/Gateways, but this site is private (a password is needed). Please forward a copy of this document. We would like to be able to assess the maturity of the Science Gateways activities. Please provide appropriate information during the presentations.
• Are effective science portal building environments available to the user community?
  – If so, what is available? I.e., what science portals that invoke simulations and/or manage massive data sets are in operation across TeraGrid and used by discipline science communities?
  – If not, what is the progress toward this?
• Has a Grid/Web Services environment been established?
  – To what extent is it used by the science community?
• What cross connections / resource sharing have been made with other Grids?
  – How much effort and funding has been/will be invested in developing and testing inter-grid interoperability?
TeraGrid User Services
Sergiu Sanielevici, Pittsburgh Supercomputing Center
Area Director for User Services
Components of User Support
• 24/7 help desk integrating all sites
• Training and tutorials
• Extensive documentation
• TeraGrid User Portal
• User contact team
• Intensive support
  – ASTA
  – Science Gateways
• User Survey
TeraGrid User Portal Vision
• Integrate important user capabilities in one place:
  – Information services
    • Documentation, training, real-time consulting
    • Notification (news, MOTDs, next downtimes, etc.)
    • Resource info, calendars, cross-site run scheduling
    • Network info
  – Account services
    • Allocation requests
    • Allocation management & usage reporting
    • Account management (including setting up grid credentials)
  – Interactive services
    • Job launching
    • File transfers
    • Linear workflows
    • Data mining
  – Listing of and access to data collections
  – Remote visualization (interactive), eventually collaborative
• With personalization and customization options, it can serve as a foundation for application portals and (some) science gateways
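The "job launching" capability above can be made concrete: a portal of this period typically translated a web form into a Globus GRAM RSL job description and submitted it under the user's grid credential. A minimal sketch of assembling RSL (the helper function and its defaults are illustrative, not the actual User Portal code; the attribute names are standard GRAM RSL):

```python
def make_rsl(executable, arguments=(), count=1, queue=None, max_wall_minutes=None):
    """Build a Globus GRAM RSL job description string.

    A portal-side "job launching" service can assemble RSL like this from
    web-form input, then submit it with the user's grid credential.
    """
    clauses = [("executable", executable), ("count", str(count))]
    if arguments:
        # keep the sketch short: join arguments into one space-separated string
        clauses.append(("arguments", " ".join(arguments)))
    if queue is not None:
        clauses.append(("queue", queue))
    if max_wall_minutes is not None:
        clauses.append(("maxWallTime", str(max_wall_minutes)))
    return "&" + "".join(f"({k}={v})" for k, v in clauses)

print(make_rsl("/bin/date", count=4, queue="normal"))
# &(executable=/bin/date)(count=4)(queue=normal)
```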
Proactive Approach to Discovering and Meeting User Requirements
•User Contact team for each allocated LRAC/MRAC project
• Results in the ability to understand, track and anticipate the evolving needs of users
•Codes specifically written or requested by allocated users receive highest installation priority
–Optimization, Scaling, I/O, ETF Network Utilization, Workflow mapping
–Overcoming application-level obstacles to portability and interoperation
–Resolving third-party package issues
•Intensive support for selected projects: ASTA Program
Plans for 2006
•Improve reach and quality of personalized, proactive user support system
•Improve tracking and logging of staff-user interactions
•Improve User Survey content, administration, and follow-up
• Work with external evaluators
• Consider new tools, e.g. a User Forum
Advanced Support for TeraGrid Applications (ASTA)
• Inaugurated 6/1/05; 10 projects now underway
• Already produced remarkable new science using TG-deployed software, including the SC05 Analytics Challenge winner
• Help users to:
  – Achieve their science objectives
  – Use TeraGrid resources effectively and in novel ways
• Improve the quality of the TeraGrid infrastructure
  – Provide feedback to staff when testing, piloting and exercising TeraGrid capabilities
• Projects selected by TG staff and NSF from PIs willing and able to assign developer time from within their projects
Simulation of Blood Flow in Human Arterial Tree on the TeraGrid
Supported by NSF and TeraGrid
Team Members
Brown University: S. Dong, L. Grinberg, A. Yakhot, G.E. Karniadakis
Imperial College, London: S.J. Sherwin
Argonne National Lab: N.T. Karonis, J. Insley, J. Binns, M. Papka
ASTA: D.C. O'Neal, C. Guiang, J. Lim
Simulating & Visualizing Human Arterial Tree
[Diagram: computation and visualization pipeline. Flow data streams from computation sites in the USA and UK to visualization servers at ANL, and on to a viewer client at SC05 in Seattle, WA]
What ASTA Helps With
• NekTar development and porting
• MPICH-G2 on heterogeneous platforms
• Cross-platform access and "firefighting"
• Visualization
• Project coordination
CMS on the TeraGrid
Compact Muon Solenoid (CMS) Experiment at the Large Hadron Collider
PI: Harvey Newman, Caltech
TeraGrid ASTA Team: Tommy Minyard, Edward Walker, Kent Milfeld, Jeff Gardner
• Simulations run simultaneously across multiple TeraGrid sites (SDSC, NCSA and TACC) using the grid middleware tool GridShell
• Complex workflow consisting of multiple execution stages running large numbers of serial jobs (thousands), with very large datasets stored on SDSC HPSS and staged to local sites prior to job runs
• Used 420K CPU hours on TeraGrid systems last year; usage is expected to increase this year and in coming years
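The stage-then-run pattern in the second bullet can be sketched as a simple ordered execution plan: each large dataset is staged from the archive to site-local scratch, its serial jobs run against the local copy, and scratch is cleaned afterwards. All dataset names and paths below are made up for illustration, not CMS production values:

```python
def plan_campaign(datasets, jobs_per_dataset):
    """Order the steps of a staged serial-job campaign.

    Each dataset is staged from archive (e.g. HPSS) to local scratch
    before its jobs run, then the scratch copy is cleaned up.
    """
    plan = []
    for ds in datasets:
        plan.append(("stage", f"hpss:/archive/{ds}", f"scratch:/{ds}"))
        for i in range(jobs_per_dataset):
            plan.append(("run", f"{ds}/job{i:04d}"))
        plan.append(("cleanup", f"scratch:/{ds}"))
    return plan

for step in plan_campaign(["events_A", "events_B"], jobs_per_dataset=3):
    print(step)
```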
The CMS experiment is searching for the Higgs particles, thought to be responsible for mass, and for supersymmetry, a necessary element of string theory.
Currently running event simulations and reconstructions to validate methods prior to experimental data becoming available.
“Using the NSF TeraGrid for Parametric Sweep CMS Applications”, to appear in Proceedings of the International Symposium on Nuclear Electronics and Computing (NEC’2005), Sofia, Bulgaria, Sept. 2005
What TeraGrid Staff Helped With (pre-ASTA)
• GridShell development allows the TeraGrid to be used as a personal Condor pool
  – Condor jobs scheduled across multiple sites
  – No need for shared architectures or queuing systems
  – Makes use of TeraGrid protocols for data transfer
  – Fits into the existing TeraGrid software stack
• CMS production chain run through this system
  – 40,000 jobs
  – SC05 demo
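The personal-pool idea above means a production sweep of thousands of serial jobs reduces to a single submit description. A sketch of generating one (the keywords are standard Condor submit syntax and $(Process) is Condor's per-job macro, but the executable name is hypothetical):

```python
def condor_submit_description(executable, n_jobs):
    """Generate a Condor submit description for a sweep of serial jobs.

    One 'queue N' statement covers the whole campaign; Condor's
    $(Process) macro (0..N-1) distinguishes the individual jobs.
    """
    return "\n".join([
        "universe   = vanilla",
        f"executable = {executable}",
        "arguments  = $(Process)",
        "output     = out.$(Process)",
        "error      = err.$(Process)",
        "log        = sweep.log",
        f"queue {n_jobs}",
    ])

print(condor_submit_description("simulate_events", 40000))
```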
Current ASTA Projects Span Disciplines

Project (PI) | ASTA Support | Discipline | End Date
Cellulose + cellulase interactions using CHARMM (PI Brady) | Port, scale and optimize code | Molecular Dynamics | 3/31/2006
MD Data Repository (PI Jakobsson) | Implementation of architectural components | Molecular Dynamics | 3/31/2006
Liquid Rocket Engine Coaxial Injector Modeling (PI Heister) | Computational model development and implementation | Computational Fluid Dynamics | 3/31/2006
NekTar Arterial Tree Simulations (PI Karniadakis) | Code porting and optimization; MPICH-G2 and visualization support | Computational Fluid Dynamics | 3/31/2006
Vortonics: CFD with Vortex Degrees of Freedom (PI Boghosian) | MPICH-G2 and visualization support | Computational Fluid Dynamics | 3/31/2006
SPICE Non-Equilibrium Simulations (PIs Coveney and Boghosian) | Code deployment, grid and steering implementation support | DNA Modeling | 3/31/2006
ENZO Cosmic Simulator (PI Norman) | Code optimization and scaling, network data handling and archiving | Cosmology | 3/31/2006
SCEC TeraShake-2 and CyberShake (PI Olsen) | Code optimization, TG data handling and archiving, task flow mapping | Seismology | 3/31/2006
CIG: Cyberinfrastructure for Geodynamics (PI Gurnis) | Develop software framework, repository, portal and training | Geophysics | 5/31/2006
BIRN (Biomedical Informatics Research Network) (PI Ellisman) | Develop and optimize codes; map task flows to TG | Biomedical Imaging | 9/30/2006
Proposed ASTA Candidates
Project | Discipline
LEAD: Storm-Scale Forecasts and Library | Atmospheric modeling
CERN LHC support: CMS; ATLAS | High energy physics
BNL RHIC experiment: STAR | High energy physics
nanoHUB: NEMO-3D | Nanotechnology
NAMD-G | Molecular Dynamics
PPM: Turbulent Astrophysical Flows, interactive simulations | Astrophysics
TeraGrid Science Gateways
Nancy Wilkins-Diehr, San Diego Supercomputer Center
Area Director for Science Gateways
Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
  – Resources
  – Users, from expert to K-12
  – Software stacks, policies
• Science Gateways
  – Provide "TeraGrid Inside" capabilities
  – Leverage community investment
• Three common forms:
  – Web-based portals
  – Application programs running on users' machines but accessing services in TeraGrid
  – Coordinated access points enabling users to move seamlessly between TeraGrid and other grids
Technical Approach
Biomedical and Biology: Building Biomedical Communities

[Architecture diagram: an OGCE Science Portal built with open source tools. OGCE portlets with container and Apache Jetspeed internal services call local portal services through a service API; they reach grid services and grid resources via grid service stubs and the Java CoG Kit using grid protocols, and reach remote content servers via remote content services over HTTP]

• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid

[Screenshot: Workflow Composer]
Initial Focus on 10 Gateways Listed in Program Plan
Science Gateway Prototype | Discipline | Science Partner(s) | TeraGrid Liaison
Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric | Droegemeier (OU) | Gannon (IU), Pennington (NCSA)
National Virtual Observatory (NVO) | Astronomy | Szalay (Johns Hopkins) | Williams (Caltech)
Network for Computational Nanotechnology (NCN) and "nanoHUB" | Nanotechnology | Lundstrom (PU) | Goasguen (PU)
Open Life Sciences Gateway | Biomedicine and Biology | Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA) | Stevens (UC/Argonne)
Biology and Biomedical Science Gateway | Biomedicine and Biology | Cunningham (Duke), Magnuson (UNC) | Reed (UNC), Blatecky (UNC)
Neutron Science Instrument Gateway | Physics | Cobb (ORNL) | Cobb (ORNL)
Grid Analysis Environment | High-Energy Physics | Newman (Caltech) | Bunn (Caltech)
Transportation System Decision Support | Homeland Security | Stephen Eubanks (LANL) | Beckman (Argonne)
Groundwater/Flood Modeling | Environmental | Wells (UT-Austin), Engel (ORNL) | Boisseau (TACC)
Science Grid [GriPhyN/iVDGL/Grid3] | Multiple | Pordes (FNAL), Huth (Harvard), Avery (UFlorida) | Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)
Proposed Supplemental Activity: Empowering Science, Research, and Discovery
Russ Miller, Mark Green, University at Buffalo
• Enabling scientific and engineering domain applications using Grid-enabling Application Templates (GATs)
• Porting 16 applications per year and training 20-30 research groups per year
So how will we meet all these needs?
• With RATs! (Requirements Analysis Teams)
• Collection, analysis and consolidation of requirements to jump-start the work
  – Interviews with 10 Gateways
  – Common user models, accounting needs, scheduling needs
• Summarized requirements for each TeraGrid working group
  – Accounting, Security, Web Services, Software
• Areas for more study identified
• Primer outline for new Gateways in progress
• And milestones
Implications for TeraGrid working groups
• Accounting
  – Support for accounts with differing capabilities
  – Ability to associate a compute job with an individual portal user
  – Scheme for portal registration and usage tracking
  – Support for OSG's Grid User Management System (GUMS)
  – Dynamic accounts
• Security
  – Community account privileges
  – Need to identify the human responsible for a job for incident response
  – Acceptance of other grids' certificates
  – TG-hosted web servers, cgi-bin code
• Web Services
  – Initial analysis completed 12/05
  – Some Gateways (LEAD, Open Life Sciences) have immediate needs
  – Many will build on capabilities offered by GT4, but interoperability could be an issue
  – Web service security
  – Interfaces to scheduling and account management are common requirements
• Software
  – Interoperability of software stacks between TG and peer grids
  – Software installations for gateways across all TG sites
  – Community software areas
  – Management (pacman, other options)
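The accounting and security requirements above hinge on one mechanism: gateway jobs run under a shared community Unix account, so the gateway itself must keep an audit trail tying each job back to a registered portal user for usage tracking and incident response. A minimal sketch (class and field names are illustrative, not a TeraGrid interface):

```python
import time

class CommunityAccountAudit:
    """Minimal audit trail mapping community-account jobs to portal users."""

    def __init__(self, community_account):
        self.community_account = community_account
        self.records = []

    def record_job(self, portal_user, job_id, site):
        # Every submission is logged with the shared unix account,
        # the responsible portal user, and enough detail to trace it.
        self.records.append({
            "unix_account": self.community_account,
            "portal_user": portal_user,
            "job_id": job_id,
            "site": site,
            "submitted": time.time(),
        })

    def responsible_user(self, job_id):
        # Incident response: which registered user was behind this job?
        for r in self.records:
            if r["job_id"] == job_id:
                return r["portal_user"]
        return None

audit = CommunityAccountAudit("bioportal")
audit.record_job("alice@example.edu", "job-1021", "SDSC")
print(audit.responsible_user("job-1021"))
```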
Significant Progress in CY2005
• January-March
  – Initial Gateway interviews and requirements analysis completed
• April
  – Internal web page
    • Project descriptions, RAT reports, staffing, milestones, email archives, presentations
• May
  – Biweekly calls begin
    • Variety of issues discussed, special presentations
  – Accounts for all developers
  – Progress tracking for all gateways
  – Special presentations
    • Edward Walker, GridShell
    • Lee Liming, GT4
  – Address recommendations to and from tg-acctmgmt and security-wg
  – Three new RATs
    • Portal technology (John Cobb)
    • Web services (Ivan Judson)
    • OSG (Stuart Martin)
• June
  – International Science Gateways workshop at GGF14
• August
  – Repository area for software exchanges
    • JDBC SQL for accounting queries to be first piece of contributed code
• September
  – Security-wg provides requirements for community accounts
• October
  – Gateways provide means to collect required info; expanded user responsibilities form for community accounts in production
  – Production community accounts in use (nanoXX, bioportal)
  – Discussions with security-wg about portal hosting within TG (NVO, HEP)
  – SC05 prep begins: demos, posters, movie clips, images, booth scheduling
  – Web Services recommendations complete
  – "How to become a gateway" at www.teragrid.org
  – User-friendly listing of gateways
• November
  – SC05 focus continues
  – GT4 deployment evaluation; Mike Showerman joins call
  – Special presentations
    • GridChem
    • PURSE and GAMA
  – Call with Roy Williams and security-wg to discuss "weak cert" concept
  – Gateway plans collected for Program Plan
• December
  – Finalize Program Plan input
  – Outline plans for next quarter
Early CY2006 Plans
• CI Channel presentation (March)
• Montana State Workshop sponsored by Lariat (March)
  – How Grid Computing Can Accelerate Research
  – Special talks on bioinformatics and the Grid
• Portal Technology RAT, John Cobb
• Account management through User Portal, Eric Roberts
• Audit trails for community accounts
• Begin implementation of TG- and Gateway-provided web services
• Complete further analysis of scheduling requirements and implementation ideas
• Full-day training session at TG AHM
Gateways Under the Hood: Open Life Science Gateway and Web Services
• OLSG integrates four components:
  – Tools from the National Microbial Pathogen Data Resource (http://www.nmpdr.org) and TheSeed (http://theseed.uchicago.edu/FIG/index.cgi)
  – Open bioinformatics tools and data
  – Web services
  – TeraGrid resources
• Providing:
  – Web-based access for account administration, simple access to resources, and documentation
  – Web service based access to tools, including:
    • Taverna, Kepler, other workflow tools
    • Microsoft development environment
    • Open source web service toolkits: SOAP::Lite [Perl], ZSI [Python], Apache Axis [C/Java]
    • Bioinformatics toolkits such as BioPerl and BioPython
  – Data access
• TeraGrid presentation requested for the February NIH meeting
• http://lsgw.mcs.anl.gov/
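The toolkits listed above (SOAP::Lite, ZSI, Axis) all exchange SOAP messages; the envelope such a client sends to a gateway tool can be sketched with the Python standard library. The runBlast method, its parameters and namespace are invented for illustration, not the actual OLSG interface:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def soap_request(method, params, service_ns="urn:olsg-example"):
    """Build a SOAP 1.1 request envelope for a gateway tool invocation."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        # each parameter becomes a child element of the method call
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(env, encoding="unicode")

msg = soap_request("runBlast", {"sequence": "MKTAYIAK", "database": "nr"})
print(msg)
```

A real client would POST this envelope over HTTP with a SOAPAction header; the toolkits above generate and parse exactly this shape of message automatically.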
OLSG Helps Define TG-wide Policies
• Q1FY06 Accomplishments
  – Web service enabled the SEED software
  – Developed Life Science Gateway architecture
  – Led Web Services RAT, working to develop the right model for Gateways with respect to TeraGrid resources, security, and user model
• Q2FY06 Plans
  – Deploy prototype web/grid services based, TeraGrid-hosted access to community-developed computational phylogeny tools (e.g., the PHYLIP suite)
  – Develop strategy for supporting the large-scale computing needs of the National Centers for Biomedical Computing (i.e., the BISTI Centers)
Gateways Under the Hood: LEAD, Workflows and Web Services
• Providing tools needed to make accurate predictions of tornadoes and hurricanes
• Data exploration and Grid workflow

[Screenshot: Log in and see your MyLEAD space]
Creating a workflow for Data Mining
•Use ADaM services from UAH
[Workflow diagram: NEXRAD Level II radar data and an ESML descriptor feed an ESML converter (data transformation service), followed by a MinMax normalizer (data normalization service), a Bayes classifier (classification service) and a feature extraction service for 3D mesocyclone detection, ending in visualization]
Monitor results in real time
Large workflows can be composed
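Linear workflow composition of the kind shown can be reduced to function chaining: each service consumes the previous service's output. The toy stages below stand in for the ADaM transformation, normalization and classification services; they are placeholders, not the actual service implementations:

```python
def compose(*stages):
    """Compose data-mining stages into a single linear workflow.

    Each stage is a callable; the output of one feeds the next,
    mirroring transform -> normalize -> classify above.
    """
    def workflow(data):
        for stage in stages:
            data = stage(data)
        return data
    return workflow

transform = lambda xs: [float(x) for x in xs]                             # ESML-style conversion
normalize = lambda xs: [(x - min(xs)) / (max(xs) - min(xs)) for x in xs]  # min-max normalization
classify  = lambda xs: ["high" if x > 0.5 else "low" for x in xs]         # threshold classifier

detect = compose(transform, normalize, classify)
print(detect(["3", "9", "6"]))
# ['low', 'high', 'low']
```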
Educational Resources
Gateways Under the Hood: OSG and Grid Interoperation
• OSG RAT led by Stuart Martin
  – Implementation of grid service interoperability
    • Deploying and supporting common grid services and protocols
    • Creating OSG Gateways
  – Basic grid interoperability services
    • Authentication / Authorization / Accounting (AAA)
    • Information services
    • Job execution
    • Data handling
  – User- and application-level grid interoperability services
    • Resource discovery / selection
    • Resource brokering
    • Job submission and bookkeeping
    • Data management
  – Interoperability quality assessment
    • User support and troubleshooting
    • Application performance
• Grid Interoperability working group formed 12/05
Grid Interoperation
• TeraGrid/OSG Interop work (Stuart Martin et al.) drove organization of a multi-grid interoperation initiative begun in 2005.
• Leaders from TeraGrid, OSG, EGEE, APAC, NAREGI, DEISA, PRAGMA, UK NGS and KISTI will drive the interoperation initiative in 2006.
• Six international "RATs" will meet for the first time at GGF-16 in February 2006:
  – Application Use Cases (Bair/TeraGrid, Alessandrini/DEISA)
  – Authentication/Identity Management (Skow/TeraGrid)
  – Job Description Language (Newhouse/UK NGS)
  – Data Location/Movement (Pordes/OSG)
  – Information Schemas (Matsuoka/NAREGI)
  – Testbeds (Arzberger/PRAGMA)

Leaders from nine Grid initiatives met at SC05 to plan an application-driven "Interop Challenge" in 2006.
TeraGrid Education, Outreach and Training (EOT)
and External Relations (ER)
Scott Lathrop, Argonne National Laboratory
Director for EOT
Mission, Goals, and Strategies
The mission is to engage larger and more diverse communities of researchers, educators and learners in discovering, using, and contributing to TeraGrid.

The goals are to:
– Enable awareness of and access to TeraGrid resources
– Provide education and training for all disciplines and all stages of learning (K-12 through professional)
– Promote diversity among all TeraGrid activities
– Expand the community of TeraGrid users

The strategies are to:
– Work with TeraGrid Science Gateways, User Support and the Core program
– Leverage strategic external partnerships
– Assess the community impact
EOT and ER Team Members
Using the User Support model, the GIG coordinates a TG-wide EOT and ER program with an enthusiastic group of RP and Core/CIP staff.
• Argonne/UChicago: Scott Lathrop, Ray Bair, Joe Insley
• Caltech: Sarah Bunn
• Indiana: Craig Stewart, Julie Wernert
• NCSA: Sandie Kappes, Edee Wiziecki, Mike Freemon, Bill Bell, Trish Barker
• ORNL: John Cobb, Betsy Riley
• PSC: Sergiu Sanielevici, Beverly Clayton, Cheryl Begandy, Mike Schneider, Sean Fulton
• Purdue: Sebastien Goasguen, Gary Bertoline, Krishna Madhavan, Steve Dunlop
• SDSC: Diane Baxter, Ange Mason, Don Frederick, Ashley Wood, Greg Lund, Diana Diehl, Tim Gumto
• TACC: Stephenie McLean, Faith Singer-Villalobos
Education Plans and Effectiveness
Plans
• Professional development for and with undergraduate faculty and secondary school teachers
• Development and dissemination of resources including software, curricular materials, and lesson plans
• Mentoring of students in using cyberinfrastructure to learn math and science, and in pursuing advanced studies

Effectiveness
• Leading the SC Education Programs, SC05-SC06
• nanoHUB used by 10 universities in dozens of undergraduate/graduate courses
• Scaling up successful EOT-PACI/EPIC projects (e.g. TeacherTECH)
• External partnerships: EPIC, NSDL Computational Science Education Reference Desk, the National Computational Science Institute, and CIP
SC Education Program Plans and Effectiveness
• Purdue is leading the SC05 and SC06 Education Program, including summer workshops
• The TeraGrid team has been asked to propose a multi-year Education Program starting with SC07-09
  – Goal is to provide greater continuity and broader, sustained integration of computational science education for undergraduate education
  – Proposal being made to the SC Steering Committee next week to initiate the program in 2006 to prepare for SC07
  – Engages a large national planning team representing multiple state and national programs that can help leverage and sustain the program
Outreach Plans and Effectiveness
Plans
• Raise awareness of TeraGrid's impact on research and education
• Engage under-represented people in TeraGrid development and use, with a focus on MSI college faculty and students
• Reach out to new communities that have not traditionally been users of cyberinfrastructure and grid computing

Effectiveness
• New Science Gateways: Telescience, BIRN and NEES
• Community engagement with applications via professional society meetings, conferences, and workshops; usage has increased
• External partnerships: Minority Serving Institution Network; Humanities, Arts, and Social Sciences (HASTAC and CHASS)
Training Plans and Effectiveness
Plans
• Hands-on training for researchers on topics from introductory to advanced applications of grid computing
• Training venues include live workshops, Access Grid sessions, and on-line WebCT courses
• Coordination of training opportunities across TeraGrid

Effectiveness
• Review of training materials by experts in the field
• Post-workshop surveys of participants assessing quality
• Tracking of WebCT course usage for enhancement
• User surveys provide feedback on quality and needs
• Identification of needs by ASTA, Science Gateways, and User Support
• Joint workshops and training activities by GIG, RPs, and CIP
• PSC is investigating a Standardized User Monitoring Suite
• Established partnerships: NMI, National Microbial Pathogen Data Resource (NMPDR), and CIP
External Relations Plans and Effectiveness
Plans
• Promote TeraGrid use and adoption via publicity
• Organize public relations efforts
• Highlight TeraGrid's value via communications
• Communicate technical changes for smooth transitions
• Provide internal communications strategies for all of TG

Effectiveness
• Press releases, news stories, science nuggets
• Publications: TeraGrid brochure, user publications lists
• Website: increased usage
• Presentations: multiple venues and multiple events
• Event management and logistics (e.g. SCxx)
• External partnerships: OSG, ASCRIBE, GridToday, HPCwire