FROM EGEE TO EGI: THE ROLE OF VIRTUAL RESEARCH COMMUNITIES IN MOLECULAR AND MATERIALS SCIENCE Antonio Laganà* Department of Chemistry, University of Perugia,

Post on 12-Jan-2016


FROM EGEE TO EGI: THE ROLE OF VIRTUAL RESEARCH COMMUNITIES IN MOLECULAR AND MATERIALS SCIENCE

Antonio Laganà*
Department of Chemistry, University of Perugia, Italy
* With the collaboration of several members of the COMPCHEM Virtual Organization

SUMMARY
• THE EGEE GRID AND ITS IMPLICATIONS FOR COMPUTATIONAL MOLECULAR AND MATERIALS SCIENTISTS
• PAVING THE WAY TO EGI
• FROM SIMBEX (SIMULATOR OF MOLECULAR BEAM EXPERIMENTS) TO GEMS (GRID EMPOWERED MOLECULAR SIMULATOR)
• FROM COMPCHEM TO CMST
• GRIDIFICATION APPROACHES
• FORWARD LOOKING

1 - THE EGEE GRID AND ITS IMPLICATIONS FOR COMPUTATIONAL MOLECULAR AND MATERIALS SCIENTISTS

The seminal European implementation of the Grid and the assembling of the COMPCHEM Virtual Organization

“A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities.”
Ian Foster, The Grid: Blueprint for a New Computing Infrastructure (1999)

The Grid: from dreams to reality

THE PERVASIVITY OF THE EGEE PRODUCTION GRID

THE EGEE PRODUCTION GRID
• EGEE is a European project aimed at developing a European grid infrastructure for science, with links to grids in the US, Latin America, India and China.
• In the first biennium little support (NA4 activity: Application Identification and Support) was given to chemistry.
• Starting from the second biennium, the molecular beam simulator (SIMBEX) was produced and the chemistry virtual organization (VO) COMPCHEM was admitted as unfunded.
• In the third biennium a prototype version of the Grid Empowered Molecular Simulator (GEMS) was designed and implemented.

THE COST EFFECTIVENESS OF THE EGEE PRODUCTION GRID
• On public network
• Off-the-shelf technology (from PCs to supercomputers)
• Evolutionary approach
• Aggregated local nodes (the Perugia case)

The initial Beowulf-Mosix “GRID”
front-end + 15 nodes, 2 PIII 1.0 GHz processors, 2 GB RAM, Intel e1000 Gigabit Ethernet NIC
3Com Gigabit Ethernet switch, 16 ports
Hybrid architecture: Beowulf MOSIX

The additional cluster “GRID”
front-end + 40 nodes, Intel Xeon quad-core X3210 2.13 GHz processors, 164 GB RAM, 8 MB Level 2 cache (2x6)
RJ45 Ethernet
2 3Com Gigabit Ethernet switches, 48 ports

FURTHER EXPANSION OF THE PERUGIA NODE
• Coordinating the original nucleus of scientists from Computer Science and Chem-dynamics with those of the local section of INFN, CNR, Chem-electronics and Drug-design.
• Gathering together the related hardware (different Tier3) and software tools and experimenting with new ones (such as GPUs, workflows and frameworks).
• Assembling the specific packages of the different scientific areas.
• Widening the service area in grid porting, training and education.

FURTHER EXPANSION OF COMPCHEM
• Increase the number of users.
• Increase the number of programs.
• Improve the support to users (registration, porting, training (2 schools), …).
• Connection with other VOs and application to INFRA-2010 as part of the ROSCOE application.

[Figure: layered structure of the Grid.
Networks: high performance nets, fiber optics.
Middleware: portals, security, communications, resource management, monitoring.
Programming tools: HP components, problem solving, libraries, cost models.
Applications: astrophysics, bioinformatics, Earth observation, geophysics, computational chemistry.]

THE DEPENDABILITY OF THE EGEE PRODUCTION GRID
• NO ADEQUATE BANDWIDTH and RELIABILITY of public networks
• NO STANDARD MIDDLEWARE (gLite, ARC, UNICORE)
• NO EFFICIENT PARALLELIZATION TOOLS (MPI libraries), PORTALS, WORKFLOWS
• NO ESTABLISHED DATA AND PACKAGE MODELS AND STANDARDS

THE CONSISTENCY AND DEPENDABILITY OF THE EGEE PRODUCTION GRID

2 - PAVING THE WAY TO THE EUROPEAN GRID INITIATIVE (EGI)

The structuring of a new, truly pan-European grid infrastructure

MISSION and STRUCTURE
• Support international research teams and projects by means of an international infrastructure to share data (knowledge) and compute resources
• Common infrastructure
  – national funding of computing research infrastructures via NGI platforms
  – coordination through EGI.ORG
  – steering by User Communities

EGI Basic Elements
• EGI ORGANIZATION
  – EGI.ORG, a light coordination body
    • Central location + decentralized bodies
    • Synergy for EU level added value
    • Coordination activities
    • Links with external bodies (Consortia, ...)
  – NGIs, stakeholders of EGI.ORG
    • national funding
    • own agenda and tasks

EGI Stakeholders
[Diagram: Research Teams, Research Institutes, Resource Centres, NGIs (NGI 1, NGI 2, …, NGI n), EGI.org]

NGI User Community Tasks
1. VO Registration and VO Database
2. Site Validation Tests
3. Core VO Service Provision
4. Help Desk and User Technical Support
5. Documentation
6. Help Desk for Application Porting
7. Case Studies
8. Consulting
9. Application Database
10. Development of Services (Grid Planning)
11. Integration of Domain’s Resources
12. Feedback
13. Dissemination
14. Community-Specific Gateways and Help Desk
15. Validation of Site Resources/Services
16. Coordination
17. User Conference – User Forum Events
18. Technical Coordination Grid Planning
19. Regional Coordination

EGI User Community Goals
1. Gathering requirements from the user communities.
2. Carrying out a review process to integrate useful “external” software.
3. Establishing Science Gateways that expose common tools and services to user communities in the various disciplines (specialized support center, SSC).
4. Establishing technical collaborations with the large ERI projects.
5. Providing “umbrella” services for collaborating projects (e.g. maintenance of repositories, FAQs, wikis, etc.).
6. Maintaining a European Grid Application Database that allows applications to be “registered”.
7. Organising European events such as the User Forum meetings and topical meetings.
8. Providing services for new communities.
9. Ensuring high quality documentation and training services.

OTHER ACTORS
• Associate members: EIROs (CERN, ESA, EBI, ...)
  – supplement NGIs for services & resources in specific sectors
• Partners: middleware consortia (gLite, UNICORE, ARC)
  – provide the OS middleware

EGI Management/Governance
[Organization chart:
EGI Council
  – Members: NGI1, NGI2, NGI3, … NGIn
  – Associate Members: e.g. EIROforum member, …
  – Non-voting Representatives: extra-EU NGIs, Chair of UFSC, …
Advisory Committees: e.g. Middleware Coordination Board (MCB), User Forum Steering Committee (UFSC)
User Forum (UF)
EGI.org: EGI Director
  – CTO, Middleware Maintenance: Middleware Unit
  – CAO, Admin & PR: Administration & PR Unit
  – UCO, User Coordination: User Community Services
  – COO, Operations: Operations Unit]

FROM EGEE to EGI
• January 20th 2009: Vote for approval of the EGI Blueprint by the EGI_DS Policy Board; first list of NGIs subscribing to the principles of EGI.
• March 2nd 2009: Catania Workshop. Approval of AMSTERDAM as the EGI location; common work plan with EGEE on transition scenario.
• Spring 2009: Transition team in place with authority to prepare key tasks and to negotiate with the EU; work on calls for EC funding.
• Summer 2009: The core of the EGI project transition team is agreed and confirmed by the Policy Board; latest date for formal establishment of EGI including location.
• Autumn 2009: The EGI project proposal is prepared and submitted for approval to the EC.
• January 1st, 2010: EGI is operational, with all key personnel being appointed (who may not yet be working for EGI, as e.g. still working for EGEE III or any other project).
• April 2010: EGI takes over from EGEE-III.

3 – FROM SIMBEX (SIMULATOR OF MOLECULAR BEAM EXPERIMENTS) TO GEMS (GRID EMPOWERED MOLECULAR SIMULATOR)

A systematic grid approach to molecular and materials science simulations

- O. Gervasi, A. Laganà, SIMBEX: a portal for the a priori simulation of crossed beam experiments, Future Generation Computer Systems, 20(5), 703-716 (2004).
- O. Gervasi, C. Dittamo, A. Laganà, A Grid Molecular simulator for E-science, Lecture Notes in Computer Science 3470, 16-22 (2005).

RESEARCH PROJECTS: CHEMISTRY COMPUTING ON THE NETWORK
• EU: Data grid, Digital libraries, … COST actions:
  - D23 (1999) METACHEM: Metalaboratories (virtual laboratories made of geographically dispersed laboratories) for complex computational chemistry applications
  - D37 (2004) GRIDCHEM: computational chemistry applications for Grid computing
• NATIONAL: analogous projects funded from national resources.

THE CROSSED BEAM EXPERIMENT of Perugia
MEASURABLES
- Angular and time of flight product distributions
INFORMATION OBTAINABLE
- Primary reaction products
- Reaction mechanisms
- Structure and lifetime of transients
- Internal energy distribution of products
- Key features of the potential

The concurrent TRAJECTORY kernel
TRAJ
1. Define quantities of general use
2. Iterate over initial conditions the integration of individual trajectories (ABCTRAJ, etc.)
3. Collect individual trajectory results
return
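The structure of the TRAJ kernel (define shared quantities, integrate independent trajectories concurrently, collect the results) can be sketched as follows. This is only an illustration under stated assumptions: the one-dimensional harmonic force, the function names and the thread pool are all hypothetical stand-ins, and the actual kernel would distribute runs of ABCTRAJ or similar codes as grid jobs.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def define_general_quantities(n_traj, seed=0):
    """Step 1: prepare what all trajectories share (here only the initial conditions)."""
    rng = random.Random(seed)
    # Hypothetical initial conditions: (position, velocity) of a toy 1D oscillator
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n_traj)]


def integrate_trajectory(initial, dt=0.01, n_steps=1000):
    """Step 2: integrate one independent trajectory with velocity Verlet (toy force F = -x)."""
    x, v = initial
    for _ in range(n_steps):
        a = -x
        x += v * dt + 0.5 * a * dt * dt
        a_new = -x
        v += 0.5 * (a + a_new) * dt
    return 0.5 * v * v + 0.5 * x * x  # final total energy of this trajectory


def traj(n_traj=8, workers=4):
    """Steps 1 to 3: define, iterate concurrently, collect."""
    initials = define_general_quantities(n_traj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Trajectories do not communicate, so they can run in any order
        energies = list(pool.map(integrate_trajectory, initials))
    return initials, energies
```

Because the trajectories never exchange data, the thread pool could be swapped for a process pool or for one grid job per batch of initial conditions without changing the collection step.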

VIRTUAL MONITORS FOR COMPUTED PRODUCT ANGULAR DISTRIBUTIONS OF THE VARIOUS CHANNELS
H + ICl → Cl + HI
H + ICl → H + ICl
H + ICl → HCl + I

KNOWLEDGE FLOW OF GEMS, A GRID EMPOWERED MOLECULAR SIMULATOR
[Diagram with the blocks: System input, Interaction, Dynamics, Statistics, Virtual Monitors]

The INTERACTION module
[Flowchart of the INTERACTION module, from START to DYNAMICS.
Decision nodes: Is there a suitable PES? Are ab initio calculations available? Are ab initio calculations feasible? Are dynamics calculations direct?
Actions: Import the PES routine; SUPSIM; FITTING; Take a database force field.]

SUPSIM: the concurrent ab initio approach
SUPSIM
1. Define the characteristics of the ab initio calculation, the coordinates used and the variables' intervals
2. Iterate over the system geometries the call of ab initio suites of codes (GAMESS, GAUSSIAN, MOLPRO, etc.)
3. Collect single molecular geometry energy
return

L. Storchi, F. Tarantelli, A. Laganà, Computing Molecular energy surfaces on the grid, Lecture Notes in Computer Science 3980, 675-683 (2006).
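The SUPSIM loop over geometries can be illustrated with a short sketch that builds the grid of geometries from the variable intervals and collects one energy per point. Everything here is an assumption made for illustration: the interval format, the function names, and above all `single_point_energy`, which is a Morse-like placeholder and not a call to GAMESS, GAUSSIAN or MOLPRO.

```python
import itertools
import math


def build_geometries(intervals):
    """Turn {name: (start, stop, n_points)} into a list of {name: value} geometries."""
    axes = []
    for name, (start, stop, n) in intervals.items():
        step = (stop - start) / (n - 1) if n > 1 else 0.0
        axes.append([(name, start + i * step) for i in range(n)])
    # Cartesian product of all coordinate axes
    return [dict(point) for point in itertools.product(*axes)]


def single_point_energy(geometry):
    """Placeholder for one ab initio run: a Morse curve per coordinate (arbitrary units)."""
    de, alpha, r_eq = 1.0, 1.5, 1.0
    return sum(de * (1.0 - math.exp(-alpha * (r - r_eq))) ** 2 - de
               for r in geometry.values())


def supsim(intervals):
    """Define the grid, iterate over geometries, collect (geometry, energy) pairs."""
    geometries = build_geometries(intervals)
    # Each point is independent of the others, so each could be a separate grid job
    return [(g, single_point_energy(g)) for g in geometries]


surface = supsim({"r1": (0.8, 1.2, 5), "r2": (0.8, 1.2, 3)})
```

The collected pairs are the kind of input a fitting step would take to produce a PES routine.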

AB INITIO CALCULATIONS
• Methods
  - wavefunction quantum approaches (MRCI)
  - density functional theory (DFT)
• Programs: often standard packages
  - ACADEMIC like GAMESS US
  - COMMERCIAL like GAUSSIAN

The FITTING module
[Flowchart of the FITTING module, ending with Return.
- Do ab initio values have the proper symmetry? NO: enforce the proper symmetry
- Are asymptotic values accurate? NO: modify asymptotic values
- Are remaining values inaccurate? YES: modify short and long range values
- Final step: application using fitting programs to generate a PES routine]

The DYNAMICS module
[Flowchart of the DYNAMICS module, leading to OBSERVABLES.
- Exact quantum calculations? YES: QDYN, integration of the exact quantum dynamics equations
- Approximate quantum calculations? YES: APPRQDYN, integration of the approximate or mixed QM and QC dynamics equations
- Semiclassical calculations? YES: SEMICLASSICAL, integration of classical equations of motion and of the associated classical action
- Otherwise: CLASSICAL, integration of the classical equations of motion]

The QDYN PROCEDURES
[Flowchart of the QUANTUM DYNAMICS procedures, leading to OBSERVABLES.
Decision nodes: Single initial quantum state? Multiple initial quantum states? State specific (summed over final states)? Fully averaged.
Procedures:
- TI: single energy atom diatom S matrix elements for all initial states
- TD: single initial state atom diatom S matrix elements for several energies
- CRP: cumulative reaction probabilities and Transition State theory
- MCTDH: reactive flux-flux correlation function method]

The concurrent time dependent approach
TD
1. Define quantities of general use
2. Iterate over initial conditions the time propagation (RWAVEPR, CYLHYP, etc.)
3. Collect single initial state S matrix element
return

The concurrent time independent approach
TI
1. Define quantities of general use including the integration bed
2. Iterate over the reaction coordinate to build the interaction matrix
3. Collect coupling matrix elements
4. Broadcast coupling matrix
5. Iterate over total energy value the integration of scattering equations
6. Collect state to state S matrix elements
return
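Unlike the trajectory case, the time independent scheme has two concurrent phases separated by a collect-and-broadcast step. The sketch below shows only that control flow; the arithmetic inside `coupling_matrix` and `propagate` is a toy placeholder with no physical meaning, and all names are hypothetical rather than taken from the ABC code.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def coupling_matrix(sector, n_channels=2):
    """Placeholder for one sector of the reaction coordinate: a symmetric toy matrix."""
    return [[1.0 / (1 + sector + abs(i - j)) for j in range(n_channels)]
            for i in range(n_channels)]


def propagate(energy, couplings):
    """Placeholder for integrating the scattering equations at one total energy.

    The toy result depends on every sector, which is why the complete set of
    coupling matrices has to reach each energy task before it starts.
    """
    return sum(energy - m[i][i] for m in couplings for i in range(len(m)))


def ti(sectors, energies, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: one task per sector, then collect all coupling matrices
        couplings = list(pool.map(coupling_matrix, sectors))
        # Phase 2: hand the collected matrices to every task, one task per energy
        results = list(pool.map(partial(propagate, couplings=couplings), energies))
    return dict(zip(energies, results))
```

The point of the design is that phase 2 cannot start until phase 1 has finished, so the collect-and-broadcast step is a synchronization barrier between two otherwise independent sets of tasks.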

The CLASSICAL PROCEDURES
[Flowchart of the CLASSICAL DYNAMICS procedures, leading to OBSERVABLES.
Decision nodes: Few single body problem? Few large body problem? Many small body problem? Fully averaged.
Procedures:
- VENUS: few body trajectory calculations
- DL_POLY, GROMACS: various ensembles calculations
- DL_POLY, GROMACS: reduced degrees of freedom
- Simplified approaches]

Using history files to rationalize mechanisms

RECROSSING IN OH + HCl → H2O + Cl

DIATOM-DIATOM REACTIVE PROCESSES


4 – FROM THE COMPCHEM VO TO CMST SSC

• Global approaches prompt collaboration, know-how sharing and service provision
• Collaboration prompts an evaluation of the commitment (including environmental care and social fairness) and of the productivity, as well as the establishment of an economy

A. Laganà, A. Riganelli, O. Gervasi, On the structuring of the computational chemistry virtual organization COMPCHEM, Lecture Notes in Computer Science 3980, 665-674 (2006).

• COMPCHEM VO (http://compchem.unipg.it) is a virtual organization coordinated by the University of Perugia
  - running on the EGEE production Grid since the end of 2004
  - 80 (system, development, application) users
  - 8000 CPUs (~8% of the EGEE resources)
  - strong ties with two COST actions: D23 (METACHEM, 1999) and D37 (GRIDCHEM, 2005)
  - tight connections with other VOs of the Computational Chemistry cluster (e.g. GAUSSIAN)

• COMPCHEM ITALIAN support sites
se.grid.unipg.it (UNI-Perugia)
se-01.grid.sissa.it (SISSA-Trieste)
gridsrm.ts.infn.it (INFN-Trieste)
prod-se-01.pf.infn.it (INFN-Padova)
grid-e0-engine04.esrin.esa.int (ESA-esrin)
cmsdcache.pi.infn.it, gridse.pi.infn.it (INFN-Pisa)
grids.sns.it (SNS-Pisa)
aliserv1.ct.infn.it (INFN-Catania)
egse.frascati.enea.it, egse.cresco.portici.enea.it (GRISU.ENEA.Grid)
spacin-wn03.dna.unina.it (GRSU-SPACI-Napoli)
t2-dpm-01.na.infn.it (INFN-Napoli-Atlas)
grid2.fe.infn.it (INFN-Ferrara)
grid003.ca.infn.it (INFN-Cagliari)

• COMPCHEM EUROPEAN support sites
plethon.grid.ucy.ac.cy (CY-01-Kimon)
grid05.lal.in2p3.fr, polgrid4.in2p3.fr (GRIF)
se02.marie.hellasgrid.gr, se01.marie.hellasgrid.gr (GR-06-iasa)
se01.grid.uoi.gr (GR-10-uoi)
se01.isabella.grnet.gr (HG-01-grnet)
se01.afroditi.hellasgrid.gr (HG-03-auth)
se01.kallisto.hellasgrid.gr (HG-04.cti-ceid)
se01.ariagni.hellasgrid.gr (HG-05.forth)
se01.athena.hellasgrid.gr (HG-06.ekt)
gridstore.cs.tcd.ie (csTCDie)
se.reef.man.poznan.pl (PSNC)
se2.egee.cesga.es (CESGA-EGEE)
se2.ppgrid1.rhu1.ac.uk (UKI-lt2-rhul)

COMPCHEM Applications
• COLUMBUS, Vienna (Austria): high-level ab initio molecular electronic structure calculations
• GAMESS-US, Catania (Italy): high-level ab initio molecular quantum chemistry
• ABC, Perugia (Italy), Budapest (Hungary): quantum time-independent reactive dynamics
• RWAVEPR, Perugia (Italy), Vitoria (Spain): quantum time-dependent reactive dynamics
• MCTDH, Barcelona (Spain): multi-configurational time-dependent Hartree method
• FLUSS, Barcelona (Spain): Lanczos iterative diagonalisation of the thermal flux operator
• DIFF REAL WAVE, Melbourne (Australia): quantum differential cross-section (work in progress)
• VENUS, Vitoria (Spain): classical mechanics cross sections and rate coefficients
• DL_POLY, Iraklion (Greece), Perugia (Italy): molecular dynamics simulation of complex systems
• CHIMERE, Perugia (Italy): chemistry and transport Eulerian model for air quality simulations

Millions of cpu hours consumption

From the EGEE Accounting Portal at the Centro de Supercomputación de Galicia: http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.html

The share of COMPCHEM

THE COMPCHEM MEMBERSHIP
1. USER
PASSIVE: Runs others' programs
ACTIVE: Implements at least one program for personal usage
2. SW PROVIDER (from this level on one can earn credits)
PASSIVE: Implements at least one program for others' usage
ACTIVE: Manages at least one implemented program for cooperative usage
3. HW PROVIDER
PASSIVE: Confers at least a small cluster of processors to the infrastructure
ACTIVE: Contributes to deploying and managing the structure
4. MANAGER (STAKEHOLDER): Takes part in the development and the management of the virtual organization

• Further information at http://compchem.unipg.it

THE PLANNED SSC CMST

1. GATHER EXISTING VOs IN CHEMISTRY AND MATERIALS SCIENCE and TECHNOLOGIES (COMPCHEM, GAUSSIAN, ….) IN A SINGLE SSC (CMST)

2. ATTRACT NEW RESEARCH GROUPS AND LABORATORIES ACTIVE IN THE FIELD

3. REPRESENT THE RELATED VOs at EGI USER FORUM AND STEERING COMMITTEE LEVEL

4. INTERACT WITH THE OPERATIONAL AND USER SUPPORT UNITS OF EGI

5. DESIGN A DEVELOPMENT STRATEGY FOR THE VOS OF THE AREA

6. PROVIDE TRAINING OPPORTUNITIES AND COORDINATE DISSEMINATION ACTIVITIES

5 – FURTHER GRIDIFICATION ACTIVITIES

APPLY THE DECOMPOSITION METHODS TO OTHER PROGRAMS AND USE GRID PORTALS

Lecture Notes in Computer Science, recent papers:
- A. Costantini, N. Faginas Lago, A. Laganà, F. Huarte, A Grid Implementation of Direct Semiclassical Calculations of Rate Coefficients, 5592, 93 (2009).
- A. Costantini, N. Faginas Lago, A. Laganà, F. Huarte, A Grid Implementation of Direct Quantum Calculations of Rate Coefficients, 5592, 104 (2009).
- A. Laganà, St. Crocchianti, A. Costantini, M. Angelucci, M. Vecchiocattivi, A Grid Implementation of Chimere: Ozone Production in Central Italy, 5592, 115 (2009).
- A. Costantini, E. Gutierrez, J. Lope Cacheiro, A. Rodriguez, O. Gervasi, A. Laganà, Porting of the GROMACS package into the Grid Environment: testing of a new distribution strategy, 6019, 1-12 (2010).
- S. Rampino, F. Pirani, A. Laganà, E. Garcia, Accurate quantum dynamics on platforms: some effects of long range interactions on N+N2 reactivity, 6019, 41-52 (2010).

THE MCTDH METHOD
• Diagonalisation of the thermal flux operator defined on a dividing surface to build a reduced Krylov subspace (iterative diagonalisation by consecutive application of the thermal flux operator to a trial wave function). The outcome is a set of eigenvalues and eigenstates of the thermal flux operator.
• Time propagation of the thermal flux eigenstates employing MCTDH.
• Calculation of observables: k(T), N(E).

THE FLUSS PROGRAM: calculate the individual eigenfunctions
TIME INTEGRATION: distribute the individual propagations

FURTHER GENERALIZATION OF QUANTUM DYNAMICS

• Broaden the offering of cooperating/competing packages as web services

• Avoid electron-nuclei separation (Born-Oppenheimer) and generalize coordinates to N-body problems

• Introduce easy ways of composing packages

GENERALIZE GEMS WORKFLOWS

• Inter-job workflow

- Wrap the jobs

- Treat the jobs as objects

- Define composition rules and data links

• Intra-job workflows

- Define tools as for inter-job workflows via directives to be inserted inside the jobs
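The inter-job idea (wrap the jobs, treat them as objects, define composition rules and data links) can be sketched in a few lines. This is a hypothetical illustration, not the GEMS workflow engine: the `Job` class, the lambdas and the three job names are invented for the example, and `graphlib` from the Python standard library (3.9 or later) is used to order the jobs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class Job:
    """A wrapped job: a name, a callable, and the names of the jobs it reads from."""
    name: str
    run: Callable[..., object]
    inputs: List[str] = field(default_factory=list)  # data links to upstream jobs


def run_workflow(jobs: List[Job]) -> Dict[str, object]:
    by_name = {job.name: job for job in jobs}
    graph = {job.name: set(job.inputs) for job in jobs}
    outputs: Dict[str, object] = {}
    # static_order() yields each job only after all of its inputs
    for name in TopologicalSorter(graph).static_order():
        job = by_name[name]
        outputs[name] = job.run(*(outputs[i] for i in job.inputs))
    return outputs


# Hypothetical three-stage chain, declared out of order on purpose
workflow = [
    Job("observables", lambda traj: sum(traj) / len(traj), ["dynamics"]),
    Job("interaction", lambda: 2.0),
    Job("dynamics", lambda pes: [pes * k for k in range(4)], ["interaction"]),
]
results = run_workflow(workflow)
```

Since the execution order is derived from the data links alone, jobs can be added or rearranged without touching the runner.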

P-GRADE ABC workflow
Gridification of ABC: classical command line interface, P-GRADE Grid Portal 2.7
Generator: generates input files with different parameters
Executor: executed in parallel as many times as there are parameter sets generated by the “Generator”
Collector: collects all output files into a single TAR file
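The Generator, Executor and Collector roles of such a parameter study can be mimicked locally in a short sketch. It is not P-GRADE code: the file names, the input text layout and the `executor` body (which only echoes its input instead of running ABC) are assumptions made for illustration, and a thread pool stands in for the grid.

```python
import io
import tarfile
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def generator(jmax_values, rmax_values):
    """Generate one input 'file' (name, text) per parameter combination."""
    return [(f"abc_jmax{j}_rmax{r}.in", f"jmax = {j}\nrmax = {r}\n")
            for j, r in product(jmax_values, rmax_values)]


def executor(input_file):
    """Stand-in for one ABC run: echoes the input instead of running the real program."""
    name, text = input_file
    return name.replace(".in", ".out"), "# placeholder output for\n" + text


def collector(output_files):
    """Pack all output files into a single in-memory TAR archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in output_files:
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def parameter_study(jmax_values, rmax_values, workers=4):
    inputs = generator(jmax_values, rmax_values)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One executor instance per generated input file
        outputs = list(pool.map(executor, inputs))
    return collector(outputs)
```

Calling `parameter_study([4, 6], [10.0, 12.0])` produces an archive with four output files, one per (jmax, rmax) pair.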

Execution of 4 ABC parameter study jobs for the F + HD reaction, varying jmax and rmax, on
- a local machine (P4 3.4 GHz, 1 GB RAM)
- 4 WMS selected clusters that support the COMPCHEM VO
Better speed-up can be achieved with more parameter jobs

[Bar chart “Performance”, results of ABC: Time grid versus Time local; vertical axis: Time (min), from 0 to 2500]

Execution of 500 ABC parameter study jobs for the F + HD reaction on
- a local machine (P4 3.4 GHz, 1 GB RAM)
- WMS selected clusters that support the COMPCHEM VO

[Bar chart “Performance”, results of ABC: Time grid versus Time local; vertical axis: Time (min), from 0 to 300000]

6 – FORWARD LOOKING

DEVELOP A (COLLABORATIVE) GRID ECONOMY

• Service oriented approaches

• QoS and QoU

• Credit system and cost of services

GriF: a collaborative tool for grid empowering to computational applications
C. Manuali – A. Laganà, University of Perugia (IT)
CGW’09 Krakow (PL) – October 12-14, 2009

• GriF is meant to make grid applications black-box like and to push grid computing to a higher level of transparency (Cloud Computing), in which better memory usage, reduced CPU and wall time consumption and an optimized distribution of tasks over the grid are achieved automatically.

• GriF is a collaborative JAVA Service Oriented Architecture (SOA) framework which provides grid services aimed at exploiting the articulation of computational applications in sequential, concurrent or alternative paths on the EGI Grid by adopting SOA and Web Service standard technologies.

• GriF improves the grid by providing the VO or SSC users with standard operational modalities based on friendly user driven services. Moreover, GriF creates collaborations to add value for all parties involved also by working with service providers which can offer applications to users by composing one or more services without knowing their implementation details.


GriF in the Grid scenario
The SOA organization consists essentially of two JAVA servers and the JAVA client. The two JAVA servers are YR (Yet a Registry, used to drive the initial discovery of the Web Services offered by the VO or the SSC) and YP (Yet a Provider, used to hold the VO or SSC Web Services). The JAVA client is YC (Yet a Consumer, used to interact with GriF in Wizard/Expert mode).
In the top part of the figure, phases 1 and 2 show the services discovery and phases 3-7 show a typical program execution performed on the EGI Grid, in which the selected YP takes care of running the job on the associated User Interface (UI).
In the bottom part of the figure the grid proxy management and its YC interactions are shown.
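The division of roles between registry, provider and consumer can be illustrated with a small in-process sketch. It is written in Python for brevity, whereas GriF itself is a JAVA framework built on Web Services; the class names, method names and the "abc" service below are hypothetical and do not reproduce the GriF API.

```python
from typing import Callable, Dict, List


class Provider:
    """Stand-in for YP: holds services and runs them on request."""

    def __init__(self, name: str):
        self.name = name
        self._services: Dict[str, Callable[[dict], str]] = {}

    def offer(self, service: str, handler: Callable[[dict], str]) -> None:
        self._services[service] = handler

    def services(self) -> List[str]:
        return list(self._services)

    def run(self, service: str, job: dict) -> str:
        return self._services[service](job)


class Registry:
    """Stand-in for YR: maps service names to the provider that offers them."""

    def __init__(self):
        self._providers: Dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        for service in provider.services():
            self._providers[service] = provider

    def discover(self, service: str) -> Provider:
        return self._providers[service]


class Consumer:
    """Stand-in for YC: discovers a provider through the registry, then submits the job."""

    def __init__(self, registry: Registry):
        self.registry = registry

    def submit(self, service: str, job: dict) -> str:
        provider = self.registry.discover(service)  # discovery phase
        return provider.run(service, job)           # execution phase


yp = Provider("yp-1")
yp.offer("abc", lambda job: f"abc finished with jmax={job['jmax']}")
yr = Registry()
yr.register(yp)
yc = Consumer(yr)
result = yc.submit("abc", {"jmax": 4})
```

The consumer never needs to know how the provider runs the job, which is the black-box behaviour the framework aims for.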


GriF @ Work (Wizard Mode)
1 - Using the “Framework Management” tab to create the Grid Proxy and check the GriF Status
2 - Using the “Wizard Mode” to start the Grid Job (Parametric Jobs on EGI for the ABC program), check the Job’s Status and retrieve the results

AIR POLLUTION SIMULATION

CPM10 Concentration from CHIMERE-aerosols

Gas hydrates (Clathrates): water hydrogen bonded structures caging gas molecules
• Cl2
• H2S
• CO2
• CH4
• H2
• etc.

HYDROGEN HYDRATE

ACKNOWLEDGEMENTS

• CDK group, Dept. Chemistry, Perugia (Crocchianti, Faginas, Pacifici, Skouteris, Costantini, Rampino, Manuali)

• HPC group, Dept. Math&Inf, Perugia (Gervasi, Tasso)

• Qdyn group, COST D37 (Garcia, Huarte, Lendvay, Nyman, Balint-Kurti, Farantos)

• Other groups of COST D37
• COST-ESF, EU-FP7, MIUR (It), ESA funding

THANKS FOR YOUR ATTENTION
