National Center for Supercomputing Applications
Cloud Resources in Production Cyberenvironments for E-Science Virtual Organizations
GridChem/ParamChem Interoperability
NSF Cloud Computing Workshop
Arlington, VA, 17-18 Mar 2011
Sudhakar Pamidighantam
NCSA, University of Illinois at Urbana-Champaign
• Jayeeta Ghosh, NCSA, ParamChem
• Suresh Marru, Indiana U., OGCE
• Ye Fan, Indiana U., OGCE
• Kenno Vonnommeslaeghe, U. Maryland, ParamChem
• Narendra Polani, UKy, Middleware/ParamChem
• Michael Sheetz, UKy, Application Interfaces/ParamChem
• Vikram Gazula, UKy, Server Administration
• Tom Roney, NCSA, Server and Database Maintenance
• Nikhil Singh, NCSA, ParamChem
• Liu Yang, NCSA, GridChem
• Scott Brozell, OSC, Applications and Testing
• Rion Dooley, TACC, Middleware Infrastructure
• Stelios Kyriacou, OSC, Middleware Scripts
• Chona Guiang, TACC, Databases and Applications
• Kent Milfeld, TACC, Database Integration
• Kailash Kotwani, NCSA, Applications and Middleware
Outline
• Historical Background: Grid Computational Chemistry
• Production Environments
• Current Status Web Services
• Usage: Grid and Science Achievements
• Cloud in Hybrid Environments
• Interoperability
• Future
Motivation
Integrating Services for E-Science and Engineering in Research, Education and Training
Software - Reasonably mature and easy to use to address chemists' questions of interest
Community of Users - Need the software and are capable of using it; some are non-traditional computational chemists
Resources - Various in capacity and capability; distributed and heterogeneous
Extended TeraGrid Facility
www.teragrid.org
NSF Petascale Road Map
• Track 1 Scheme: Multi-petaflop single-site system to be deployed by 2011 at NCSA (Blue Waters) http://www.ncsa.illinois.edu/BlueWaters/
• Track 2: Sub-petaflop systems; several to be deployed until Track 1 is online

System                        OS        Cores
Dell PowerEdge (NCSA)         EM64T     9600
SGI-Altix (PSC)               IA64      768
SGI UV-Ice (NCSA)             EM64T     1568
IBM Power4 Cluster (NCSA)     Pwr4      48
IBM PowerPC (Indiana)         Pwr4      1536
Sun Constellation (TACC)      EM64T     50000

Additional systems to be online soon (currently being allocated):

System                        OS        Cores
SGI UV-Ice (PSC)              EM64T     4096
FutureGrid                    Diverse   On demand
Grids and New Opportunities
Alliance to TeraGrid
A homogeneous grid with a predefined, fixed software and system stack was planned (TeraGrid), but it was difficult to keep it homogeneous.
Local preferences and diversity lead to heterogeneous grids now (operating systems, schedulers, policies, software and services).
Openness and standards that lead to interoperability are critical for successful services.
[Diagram: Grid Hardware, Middleware and Scientific Applications layers, connected by Interfaces]
User Community
Chemistry and Computational Biology (as of Oct 04)

          NRAC         AAB          Small Allocations
#PIs      26           23           64
#SUs      5,953,100    1,374,100    640,000

TeraGrid Allocations in 2010

Discipline    # PIs    Initial Alloc. SUs
Physics       125      920,254,700
GridChem User Services
• Allocation https://www.gridchem.org/allocations/index.shtml
  Community and External Registration Reviews, PI Registration and Access Creation
  Community User Norms Established
• Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2 × 1. Semyon Bocharov et al. J. Am. Chem. Soc., 128 (29), 9300-9301, 2006.
• Chemistry of Diffusion Barrier Film Formation: Adsorption and Dissociation of Tetrakis(dimethylamino)titanium on Si(100)-2 × 1. Rodriguez-Reyes, J. C. F.; Teplyakov, A. V. J. Phys. Chem. C, 2007, 111 (12), 4800-4808.
• Computational Studies of [2+2] and [4+2] Pericyclic Reactions between Phosphinoboranes and Alkenes. Steric and Electronic Effects in Identifying a Reactive Phosphinoborane that Should Avoid Dimerization. Thomas M. Gilbert and Steven M. Bachrach. Organometallics, 26 (10), 2672-2678, 2007.
Science Enabled
• Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D. et al. J. Am. Chem. Soc. 2005, 127, 3140-3155.
• The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128 (5), 1474-1488.
• The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128 (14), 4598.
Distribution of GridChem User Community
Job Distribution
System Wide Usage

HPC System            Usage (SUs)
Tungsten (NCSA)       5507
Copper (NCSA)         86484
CCGCluster (NCSA)     55709
Condor (NCSA)         30
SDX (UKy)             116143
CCGCluster (UKy)      0.5
Longhorn (TACC)       54
CCGCluster (OSC)      62000
TGCluster (OSC)       36936
Cobalt (NCSA)         2485
Champion (TACC)       11
Mike4 (LSU)           14537
Force Field Parameterization
Molecular force fields require constant improvement as:
• New reference data becomes available that cannot be accommodated easily with existing parameter sets
• New molecular systems become amenable to computational analysis
• New models, potential energy functions and Hamiltonians for force fields are established
Coverage of force fields should constantly be extended to cover new fields of research and new functionality (nanomaterials, biomaterials and medicine, ...)
Cyberenvironments for Molecular Force Fields
• Extension of currently available models, with the resulting parameter sets to be made publicly available
• Databases of experimental and quantum mechanical reference data to be used in the parameterization process
• Integration of computational resources for data acquisition, automation of QM reference data generation
• Automated, extensible infrastructure for parameterization management, enabling rapid and systematic parameterization of novel Hamiltonians (empirical and semi-empirical)
• Systematic improvement of parameter optimization processes
Accurate Force Fields Are Needed
[Figure from A. J. Stone, Science 321, 787-789 (2008). Published by AAAS.]
Fig. 1. Errors (V) in electrostatic potential on a surface at 1.8 times van der Waals radii around N-methyl propanamide for two models. (Left) Point charges; (right) charge, dipole, and quadrupole on C, N, and O; charge and dipole on H. The errors are much reduced in the multipole approach.
OGCE Layered Workflow Architecture: Derived from LEAD Workflow System
[Diagram: three layers labeled Workflow Interfaces (Design & Definition), Workflow Specification, and Workflow Execution & Control Engines. Components labeled in the figure: XBaya GUI (Composition, Deploying, Steering & Monitoring), Gadget Interface for Input Binding, Flex/Web Composition, Taverna, Python, BPEL 2.0, BPEL 1.0, Java Code, Pegasus DAG, Scufl, Apache ODE, Condor DAGMan, Dynamic Enactor, Jython Interpreter, GBPEL]
Putting It All Together
Pegasus WMS
ParamChem-XBaya-Pegasus
• The input workflow for GridChem/ParamChem is created using the Pegasus Java DAX API.
-- A DAX can combine tasks (such as CHARMM and multiple Gaussian tasks), each taking its respective input file.
• Tasks can be mapped to specific applications (such as charmm, amber, g03 or g09) based on a simple configuration.
• Input data (instructions, structure, topology, parameters) will be staged from the middleware to the execute clusters (such as the TeraGrid systems Mercury and Abe at NCSA) using GridFTP.
• Jobs will be distributed across the multiple execute clusters using round-robin or another scheme.
-- Any heuristics-based scheduling is also possible.
• Output files will be staged back from the execute clusters to the middleware using GridFTP for post-processing/archiving.
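The round-robin distribution mentioned above can be illustrated with a minimal sketch in plain Python. This is not the Pegasus API or the GridChem middleware code: the function name assign_round_robin and the task names are hypothetical, task names are assumed to be unique, and only the cluster names (Mercury, Abe) come from the slide.

```python
from itertools import cycle


def assign_round_robin(tasks, clusters):
    """Map each task name to a cluster, cycling through the clusters in order."""
    if not clusters:
        raise ValueError("at least one execute cluster is required")
    # zip stops when the finite task list is exhausted; cycle repeats clusters.
    return {task: cluster for task, cluster in zip(tasks, cycle(clusters))}


# Hypothetical task names for illustration; cluster names are from the slide.
tasks = ["gaussian_1", "gaussian_2", "gaussian_3", "charmm_fit"]
clusters = ["Mercury", "Abe"]

assignment = assign_round_robin(tasks, clusters)
for task, cluster in assignment.items():
    print(f"{task} -> {cluster}")
```

A heuristics-based scheme, as the slide notes, would replace the simple cycling with a choice informed by, for example, queue wait times or available allocation on each cluster.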
Some New GridChem Infrastructure
• Workflow Editors
• Coupled Application Execution
• Large Scale Computing
• Metadata and Archiving
• Rich Client Platform Refactorization
• Intergrid Interactions
• Open Source Distribution http://cvs.gridchem.org/cvs/
• Open Architecture and Implementation details http://www.gridchem.org/wiki
The Cloud in our case is a part of the overall resources for computing and storage.
Cloud resources have to be usable interoperably along with other HPC and local resources.
Particular uses will be on-demand computing and high-throughput computing.
Certain routine, sensor-enabled, data-dependent computing tasks, such as hydrological event monitoring and simulation, could be handled by clouds for rapid, on-demand prediction of short-term events.
The interoperability requirements that enable data and computation to move from one resource to another should be explored.