Environment from the Molecular Level: An e-science project for modelling the atomistic processes involved in environmental issues (funded by NERC)

Dec 28, 2015


Alice Andrews
Page 1: Environment from the Molecular Level: An e-science project for modelling the atomistic processes involved in environmental issues (funded by NERC)

Environment from the Molecular Level:

An e-science project for modelling the atomistic processes involved in environmental issues

(funded by NERC)

Page 2

Molecular Environmental Issues

Radioactive waste disposal

Crystal growth and scale inhibition

Pollution: molecules and atoms on mineral surfaces

Crystal dissolution and weathering

Page 3

Rocks and Mineral Structures

Radioactive waste disposal

Crystal growth and scale inhibition

Pollution: molecules and atoms on mineral surfaces

Crystal dissolution and weathering

Page 4

The “Grand Challenge”.

Level of theory

– Quantum Monte Carlo

– Large empirical models

– Linear-scaling quantum mechanics

Adsorbing surface

– Clays, micas

– Aluminosilicates

– Natural organic matter

– Phosphates

– Carbonates

– Oxides/hydroxides

– Sulphides

Contaminant

– Organic molecules

– Halogens

– Metallic elements

Requires scientists to work together in teams - a Virtual Organisation

Page 5

Design

Approach taken:

– Over approximately 3 years we have engaged in many workshops, tutorials and prototyping sessions with developers and users, teaching users what e-Science can “do for them”, including security.

• Cooperation between CCLRC and NIEeS in Cambridge.

– Planned to integrate some tools which had already been developed/prototyped at CCLRC, UCL and Reading.

• A service-oriented approach is used for certain aspects: Grid, data management, user interfaces, metadata management. Workflow was found to be important to users, e.g. for combinatorial studies.

• Several iterations of software have enabled some usability issues to be addressed.

– Originally envisaged an “Integrated Portal Architecture” linking HPCPortal, DataPortal and visualisation services.

• We thought we knew what users would like, but actually they preferred a simpler incremental approach;

• Workflow scripting was preferred to a single portal. There are now several separate tools in use.
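As a rough illustration of what a combinatorial study means for a workflow script, the sketch below enumerates every combination of three study axes. The axis values are borrowed from the “Grand Challenge” slide purely as example data; the function name `combinatorial_runs` and the dictionary keys are hypothetical and are not taken from the e-Minerals tools.

```python
from itertools import product

# Example axis values taken from the "Grand Challenge" slide labels.
LEVELS_OF_THEORY = [
    "Quantum Monte Carlo",
    "Large empirical models",
    "Linear-scaling quantum mechanics",
]
SURFACES = [
    "Clays, micas",
    "Aluminosilicates",
    "Natural organic matter",
    "Phosphates",
    "Carbonates",
    "Oxides/hydroxides",
    "Sulphides",
]
CONTAMINANTS = ["Organic molecules", "Halogens", "Metallic elements"]


def combinatorial_runs(levels, surfaces, contaminants):
    """Return one run description per (level, surface, contaminant) combination."""
    return [
        {"level_of_theory": level, "surface": surface, "contaminant": contaminant}
        for level, surface, contaminant in product(levels, surfaces, contaminants)
    ]


runs = combinatorial_runs(LEVELS_OF_THEORY, SURFACES, CONTAMINANTS)
print(len(runs))  # 3 * 7 * 3 combinations
print(runs[0])
```

Each dictionary could then be turned into one job submission, which is the kind of repetition that makes scripting more attractive than filling in a portal form by hand.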

Page 6

E-Minerals Portal

Page 7

Technical Strategy

• Technology considerations:

– Considered: Globus GT2, SRB, Harness, CCF, Portal, Web services, visualisation tools

• Various tool sets were tried and the users “voted with their feet”

– Used: Globus, Condor, SRB, AG, MAST, RCommands, Metadata Editor, Workflow scripts, Web services, XML/ RDF/ OWL for data interoperability.

• Infrastructure

– The e-Minerals “mini-Grid” was a great success, based on earlier work at Daresbury and Manchester on Grid evaluation. The mini-Grid focuses the resources of the e-Minerals VO and includes large campus Condor pools and parallel computers. It uses Globus, Condor and GSI, with data managed using SRB.

• Collaboration tools

– Access Grid, MAST, Wiki

Page 8

Integrated Portal Architecture

Generic portal design using Globus and Web Services:

[Diagram labels: Visualisation, DataPortal, HPCPortal; Web Services (three instances); HPC Systems; Data Systems; Globus, GSI, GridFTP]

Working with GGF Grid Computing Environments Research Group

Page 9

Development Issues

• Constraints and other issues:

– Project divided from outset into:

• development team;

• application team;

• science team.

– All teams work together and collaborate on papers

– Tools written in C to integrate with existing “heritage” applications, e.g. from the Collaborative Computational Projects (CCPs)

– Other interoperability issues addressed using Web services, e.g. gSOAP (client) + AXIS (server), XML-based data models and Semantic Grid technologies RDF+OWL

– Constraints: short term goals, no prior experience of e-Science, new technology must not disrupt current work.

– High requirements on computing resources for simulation studies

• This led to a focus on workflows for repeated calculations, data management for storing and retrieving results, semantic Web technologies for data interoperability between codes

Page 10

Evaluation

• Papers presented at All Hands 2005 included:

– E-Science Usability: the e-Minerals Experience (paper 425)

– The e-Minerals Project: Developing the Concept of the Virtual Organisation to support Collaborative Work on Molecular-scale Environmental Simulations (paper 518)

• User engagement and evaluation:

– Looked at the Usability Task Force metrics.

– Our approach did not readily map onto them, but there are overlaps

– Key: understand the science users, their needs, and their natural ways of working.

– Good and bad points summarised on next slides

Page 11

Lessons Learnt

What was usable?

– Keep it simple – use effective lightweight tools for the job

– Condor and Globus – Condor job scripts were accepted readily. Condor-G and DAGMan now used. RSL also embedded in scripts.

– SRB – required little training and was found to be useful, SCommands in scripts.

– Resource Management – Globus-based resource-monitoring tool was developed (in the Portal). A meta-scheduler is being developed.

– Security – GSI proved “easy for users to work with”. The Portal uses MyProxy to ensure pervasive access. Certificates were not a problem – we offered training from Day 1.

– Collaboration tools – desktop use of AG enables ad hoc meetings + MAST (Multi-cast Application Sharing Tool). Wiki and Instant Messaging also used.

– Semantic technologies – CML was initially used with XSLT and SVG. This is now extended in the AgentX toolkit.
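To make the Condor and DAGMan point more concrete, here is a minimal sketch of a script that writes a DAGMan input file for a set of repeated runs followed by one post-processing job. `JOB`, `VARS` and `PARENT ... CHILD` are standard DAGMan keywords; the submit file names `run.sub` and `collect.sub`, the job names and the `run_index` macro are assumptions made for illustration, not files from the project.

```python
def build_dag(run_names, run_submit="run.sub", post_submit="collect.sub"):
    """Return DAGMan input text: independent runs, then one collecting job."""
    lines = []
    for index, name in enumerate(run_names):
        # One DAG node per run, all sharing the same (hypothetical) submit file.
        lines.append(f"JOB {name} {run_submit}")
        # Pass a per-run macro that the submit file can reference as $(run_index).
        lines.append(f'VARS {name} run_index="{index}"')
    # A final node that only starts once every run has finished.
    lines.append(f"JOB collect {post_submit}")
    lines.append(f"PARENT {' '.join(run_names)} CHILD collect")
    return "\n".join(lines) + "\n"


dag = build_dag(["run0", "run1", "run2"])
print(dag)
```

The text could be written to a file and handed to `condor_submit_dag`. The collecting job is a natural place for storing results or recording meta-data.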

Page 12

Lessons Learnt

What was not usable?

– Client tools * – installation has caused difficulties, e.g. Globus. Initially used “submit machines”. Solutions investigated include:

• Portal – hides the complexity behind a Web interface, user doesn’t install anything;

• Web service interfaces – for Condor (Chapman et al.), GROWL for Globus and SRB (Allan et al.);

• BPEL interface – work at UCL/ OMII – plug-in for Eclipse.

– Firewall issues – for both users and infrastructure – changes to rules lead to instability. Portal and Web services solve this problem for users.

– Meta-data – tools are available, but automatic harvesting is required to avoid mistakes. RCommands were developed to improve this and can be linked into the workflow scripts.

* A recent workshop, “Lightweight Grid Computing”, was held 2-3/5/06 at Losehill Hall. Attendees from GROWL, RealityGrid, Imperial College, e-Minerals, e-CCP… A transcript of the discussions on usability issues is available, giving more detailed information.

Page 13

Future Plans

Current and future development plans:

– New tools are being developed; for instance, the meta-data editor and RCommands were recently added to the suite.

– AgentX data-interoperability tools have been added from e-CCP extending the use of CML. Such work is now timely and illustrates how existing large codes, e.g. Siesta and GULP from CCP5 can be integrated easily with visualisation tools.

– Development staff also work on other projects and with other developers. E-Minerals tools are now being evaluated in other areas, e.g. Integrative Biology and e-CCP. There are key synergies and critical mass, sharing of experiences and code/ services.

– Full integration via a portal interface was not initially wanted, and also could not be achieved at the start of the project as the technology was not adequate (we tried PHP, now have JSR-168). This is now being re-visited as it provides a good solution to many of the problems highlighted.

– Portlet-based tools from the NGS Portal can be re-used; this has already been done for Integrative Biology and other projects. They can be combined with Wiki etc.

Some following slides show more details of some of the tools.

Blatant advert: Portals and Portlets 2006 http://www.nesc.ac.uk/esi/events/686/

Page 14

AgentX Framework - Overview

Specify how to locate data (XML, CML, XLink) with a particular meaning

Applications can use tools (AgentX library) that work with the specification to obtain information

Classes and properties of entities are specified in an ontology (OWL, RDF/XML)

Mappings (RDF/XML) associate classes and properties with fragment identifiers (XPointer)

Fragment identifiers can be used to locate logical collections (classes) and data items (properties)

[Figure: three columns labelled Ontology, Mappings and Data. Ontology: MOLECULE, ATOM, xCoordinate. Mappings: “Mol_frag_id”, “Atom_frag_id”, “xCoor_frag_id”, each with a locator pointing into the data. Data:]

O 0.000 0.000 0.000

H 0.000 0.757 0.587

H 0.000 -0.757 0.587
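The idea of separating ontology terms, mappings and data can be sketched in a few lines of Python. This is not the AgentX API: it uses the standard library's `xml.etree.ElementTree` with simple XPath-style paths standing in for XPointer fragment identifiers, a simplified CML-like document without namespaces, and hypothetical function names `select` and `get_property`. The coordinate values are those on the slide; treating the three numeric columns as x, y and z is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified CML-like data (no namespaces), using the coordinates from the slide.
DATA = """<cml>
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.000"/>
      <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="0.587"/>
      <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="0.587"/>
    </atomArray>
  </molecule>
</cml>"""

# Mappings: ontology class -> locator for the matching elements.
CLASS_LOCATORS = {
    "Molecule": "./molecule",
    "Atom": "./molecule/atomArray/atom",
}

# Mappings: (ontology class, property) -> attribute holding the value.
PROPERTY_LOCATORS = {
    ("Atom", "elementType"): "elementType",
    ("Atom", "xCoordinate"): "x3",
    ("Atom", "yCoordinate"): "y3",
    ("Atom", "zCoordinate"): "z3",
}


def select(root, class_name):
    """Locate every data element that represents the given ontology class."""
    return root.findall(CLASS_LOCATORS[class_name])


def get_property(node, class_name, property_name):
    """Read one property of a located element through the mapping table."""
    return node.get(PROPERTY_LOCATORS[(class_name, property_name)])


root = ET.fromstring(DATA)
atoms = select(root, "Atom")
for atom in atoms:
    print(
        get_property(atom, "Atom", "elementType"),
        get_property(atom, "Atom", "xCoordinate"),
        get_property(atom, "Atom", "yCoordinate"),
        get_property(atom, "Atom", "zCoordinate"),
    )
```

The calling code asks only for "Atom" and "xCoordinate"; if the data format changed, only the two mapping tables would need editing, which is the property the slide attributes to the mappings layer.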

Page 15

AgentX Framework - Example

DL_POLY3 (CCP5) integrated with CCP1 GUI

[Diagram labels: DL_POLY3 with CONTROL, CONFIG.xml and REVCON.xml; Mappings for CONFIG.xml and REVCON.xml; AgentX core with Fortran wrapper; AgentX core with Python wrapper; Standard Ontology; Standard Mappings; CCP1 GUI]

AgentX

- Core library written in C

- Wrappers for Python, Perl and Fortran

- Hides the complexities of dealing with XML

- Simple API

- Enables straightforward exchange of information

Page 16

RCommands

• RCommands are shell tools and associated Web services for meta-data manipulation

• The primary use case for RCommands is within the e-Minerals workflow, i.e. to allow automatic insertion of meta-data as a post-processing action

Function Domain             RCommand

Authentication / Session    Rinit, Rexit, Rpasswd

Entity Operations           Rls, Rcreate, Rrm

Parameter Operations        Rannotate, Rsearch

Permissions                 Rchmod

Page 17

RCommands Service-based Arch

[Diagram labels:]

Client Side: RCommands, gSOAP

Server Side: Axis, RCommand Server Code, JDBC, Relational Database

BPEL Engine: SOAP, link into workflows

Page 18

Subset of Schema

• Title

• Description

• Notes

• Start / End Dates

• Originator

• Name

• Description

• Name

• URI

Name Value Pairs
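One way to picture this subset of the schema is as three nested record types plus free-form name-value pairs. The sketch below is illustrative only: the slide lists just the fields, so the record names `Study`, `DataSet` and `DataObject`, the choice to attach name-value pairs to data objects, the `search` helper and the example URI are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """Record with the Name and URI fields (record name is hypothetical)."""
    name: str
    uri: str
    # Name-value pairs; attaching them at this level is an assumption.
    parameters: dict = field(default_factory=dict)


@dataclass
class DataSet:
    """Record with the Name and Description fields (record name is hypothetical)."""
    name: str
    description: str
    objects: list = field(default_factory=list)


@dataclass
class Study:
    """Record with Title, Description, Notes, Start / End Dates and Originator."""
    title: str
    description: str
    originator: str
    notes: str = ""
    start_date: str = ""
    end_date: str = ""
    datasets: list = field(default_factory=list)


def search(study, name, value):
    """Return every data object whose name-value pairs contain name == value."""
    return [
        obj
        for dataset in study.datasets
        for obj in dataset.objects
        if obj.parameters.get(name) == value
    ]


output = DataObject("output.xml", "http://example.org/run0/output.xml")
output.parameters["temperature_K"] = "300"  # annotate with a name-value pair
study = Study("Example study", "Illustrative record only", "A. Scientist")
study.datasets.append(DataSet("run0", "First run", [output]))
print([obj.name for obj in search(study, "temperature_K", "300")])
```

Free-form pairs keep the fixed part of the schema small while still letting a post-processing step record whatever parameters a particular simulation code produces.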

Page 19

Royal Institution

University of Reading