INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org A perspective on life science grids in Europe Vincent Breton (CNRS-IN2P3, LPC Clermont-Ferrand) ISGC 2007 March 28th, 2007

A perspective on life science grids in Europe

Jan 13, 2016


Page 1: A perspective on life science grids in Europe

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

A perspective on life science grids in Europe

Vincent Breton (CNRS-IN2P3, LPC Clermont-Ferrand)

ISGC 2007

March 28th, 2007

Page 2: A perspective on life science grids in Europe

A perspective … – March 22nd, 2007 – V. Breton 2


Content

• Introduction

• The life science grid ecosystem

• Embrace, BioinfoGRID, 2 contributions to life science activities on grids in Europe

• Conclusion

Page 3: A perspective on life science grids in Europe


Introduction

• In this talk, LIFE SCIENCE = bioinformatics and molecular biology

• Hurng-Chun Lee has already addressed grid-enabled drug discovery
– WISDOM workshop tomorrow

• Yannick Legré will address medical research later this afternoon

Page 4: A perspective on life science grids in Europe


Needs of the life sciences community

• Biologists need a growing capability to handle all the data relevant to their research topics
– Design of complex analysis workflows
– Knowledge management

• Bioinformaticians, who develop the IT services for biologists, need growing resources
– To store, update, and curate exponentially growing databases
– To run increasingly complex algorithms on this growing data set
– To build new databases exploiting the growing body of knowledge

• Biologists and bioinformaticians therefore have different needs
– Biologists need high-level environments and few resources
– Bioinformaticians need large resources to develop and/or update the services needed by biologists

Page 5: A perspective on life science grids in Europe


The life science community needs both e-science and grid infrastructures

• E-science focuses on creating new research environments for biologists
– Use of the most recent information technologies (semantics, ontologies)
– Design of virtual laboratories where the biologist can run experiments and manipulate the knowledge she/he is familiar with
– Examples: MyGrid (UK) and VLe (Netherlands) -> T. Oinn talk

• Grid infrastructures provide the resources needed at different levels
– To support bioinformaticians who maintain the databases accessed by e-science environments (update, curate, store/duplicate)
– To increase resources for e-science environments when needed
– To enable specific heavy computing or data production projects (Decrypthon)

Page 6: A perspective on life science grids in Europe


The situation in Europe

• Several e-science projects are developing high-level e-science environments that are being adopted by the biology community

• Grid infrastructures (EGEE, DEISA) are now providing robust computing and growing data management services

• Two projects are exploring the interface between e-science and grid infrastructures for life science: Embrace and BioinfoGRID

Page 7: A perspective on life science grids in Europe


Introduction

• EMBRACE is an EU-sponsored Network of Excellence aimed at enabling bioinformatics research through better operability of databases, servers, and services

Page 8: A perspective on life science grids in Europe


Example

• You want to predict phosphorylation sites just outside transmembrane helices in 1329 membrane proteins.

• Yesterday:
1) Obtain software to predict transmembrane helices;
2) Obtain software to predict phosphorylation sites;
3) Install both programs;
4) Write software that calls both programs;
5) Write software that combines outputs and presents results.

• Tomorrow:
1) Import APIs for the two services;
2) Write software that combines outputs and presents results.
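
The "Tomorrow" scenario can be illustrated with a minimal Python sketch. It is not EMBRACE code: the two predictors are passed in as plain callables standing in for whatever service clients would be imported, and the names `Helix`, `sites_near_helices`, `analyse_proteins` and the 10-residue reading of "just outside" are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Helix:
    """A predicted transmembrane helix (1-based, inclusive residue range)."""
    start: int
    end: int


def sites_near_helices(helices: List[Helix], sites: List[int],
                       window: int = 10) -> List[int]:
    """Keep sites that lie outside every helix but within `window`
    residues of some helix boundary. The window size is an assumption."""
    selected = []
    for pos in sorted(set(sites)):
        # Discard sites buried inside any helix.
        if any(h.start <= pos <= h.end for h in helices):
            continue
        # Keep sites in the flanking region before or after a helix.
        if any(h.start - window <= pos < h.start or
               h.end < pos <= h.end + window for h in helices):
            selected.append(pos)
    return selected


def analyse_proteins(sequences: Dict[str, str],
                     predict_helices: Callable[[str], List[Helix]],
                     predict_sites: Callable[[str], List[int]],
                     window: int = 10) -> Dict[str, List[int]]:
    """Step 2 of 'Tomorrow': combine the outputs of the two services.
    Both predictors are hypothetical stand-ins for remote service clients."""
    return {
        protein_id: sites_near_helices(predict_helices(seq),
                                       predict_sites(seq), window)
        for protein_id, seq in sequences.items()
    }
```

With real service clients plugged in for the two callables, the only code the user writes is the combination step, which is the point the slide makes.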

Page 9: A perspective on life science grids in Europe


Data

• EMBRACE includes nearly all European bioinformaticians with longstanding track-records in terms of providing databases, servers, and services.

• Data types that they will make available: DNA sequences, protein sequences, macromolecular structures, SNPs, expression information, alignments, untranslated regions, structure domains, protein families, literature, electron micrographs, orthologs, ORFs, genome annotation, proteomics patterns, GPCRs, protein interactions, nucleotid

Page 10: A perspective on life science grids in Europe


Software

• EMBRACE includes nearly all large European bioinformatics centers, all of which will make their servers, services, and computational tools available using the EMBRACE-GRID.

• Computational facilities that all European bioinformaticians will get at their fingertips include: DNA sequence analysis, genome annotation, homology searches at sequence and structure level, structure analysis, visualization, protein sequence analysis, phylogeny, protein domain mapping, pattern matching, HMM, neural nets, micro-arrays, workflow management, text-mining, systems biology, database techno

Page 11: A perspective on life science grids in Europe


Contact

• EMBRACE is coordinated by Graham Cameron and Kerstin Nyberg at the EBI.

• Peter Rice coordinates the content integration

• Alan Bleasby coordinates the tools integration

• Vincent Breton coordinates technology recommendation

• Erik Bongcam Rudloff coordinates the test cases

• Gert Vriend coordinates outreach and education

Page 12: A perspective on life science grids in Europe

[Diagram labels: Application User interface; Application interface]

Recommendation: web service technology

Page 13: A perspective on life science grids in Europe


Embrace grid

• Develop standard web service interfaces to tools and databases

• Provide a workbench (Taverna) to exploit these tools and data (-> T. Oinn talk)

• Support database providers in accessing grid infrastructures for resources and grid services
– Pioneers: Swiss Institute of Bioinformatics, CMBI (Netherlands)
– Creation of an Embrace VO on EGEE

• Develop interfaces between the e-science environments and the grid infrastructure
– Issues: web service interface to grid services

Page 14: A perspective on life science grids in Europe


Refinement of Protein structures

• Project led by Gert Vriend (CMBI, Nijmegen, NL)

• Goal: recalculate the 3D structures of the 40000 proteins stored in PDB with an improved image reconstruction algorithm

• Estimated computing need: 4 CPU years

• Status: under deployment on the Embrace EGEE VO
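
To put the 4 CPU-year estimate in perspective, the short Python sketch below converts it into an average cost per structure and an idealised wall-clock time. The figure of 200 concurrently available CPUs is a hypothetical value chosen for illustration, not one from the talk, and the calculation assumes perfect scaling with no queueing or failure overhead.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def cpu_minutes_per_structure(cpu_years: float, n_structures: int) -> float:
    """Average CPU time per structure, in minutes."""
    return cpu_years * HOURS_PER_YEAR * 60 / n_structures


def wall_clock_days(cpu_years: float, n_cpus: int) -> float:
    """Idealised elapsed time if the work is spread evenly over n_cpus.
    Assumes perfect scaling and no scheduling overhead."""
    return cpu_years * 365 / n_cpus


if __name__ == "__main__":
    # 4 CPU years over 40000 structures: a little under an hour each.
    print(round(cpu_minutes_per_structure(4, 40000), 2))
    # Hypothetical: 200 CPUs available at once.
    print(round(wall_clock_days(4, 200), 1))
```

Under these assumptions each structure costs about 52.6 CPU minutes on average, and the whole recalculation would take roughly a week on 200 CPUs, which is the kind of workload that fits a grid VO well.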

Page 15: A perspective on life science grids in Europe

http://www.bioinfogrid.eu FP6-026808: BioinfoGRID 05 Feb 2007 Brussels, Belgium

BioinfoGRID Objective

• The BioinfoGRID project aims to deploy bioinformatics applications on existing grid infrastructures (EGEE)

Page 16: A perspective on life science grids in Europe


Grid-enabled in silico drug discovery

[Diagram labels: Cheminformatics teams; Biology teams; Bioinformatics teams; Chemist/biologist teams; Docking services; MD service; Annotation services; target; hits; selected hits; Grid service customers; Grid service providers; Grid infrastructure; Check point (three times); In vitro tests]

Page 17: A perspective on life science grids in Europe


Molecular Dynamics (MD) simulation on the grid

• Choice of Amber as MD software
– MMPBSA procedure developed by G. Rastelli (Univ. Modena)
– Licensing issues addressed
    – One license per grid site deploying Amber
    – Use of Amber restricted to grid users coming from institutes owning a license
    – Access granted to all the nodes where Amber is installed

• Deployment of MD calculations on EGEE
– Reranking of the 2500 best hits coming out of the first WISDOM data challenge on malaria in February
– Selection of 100 best hits
– In vitro tests to start next month at Chonnam National University
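
The reranking and selection step can be sketched in a few lines of Python. This is an illustration, not the WISDOM or BioinfoGRID pipeline: it assumes each hit carries a single MD-derived score behaving like an estimated binding free energy, where lower means better, and the function name `select_best_hits` is hypothetical.

```python
import heapq
from typing import Dict, List


def select_best_hits(scores: Dict[str, float], keep: int = 100) -> List[str]:
    """Return the `keep` compound identifiers with the lowest scores,
    best first. Assumes lower score = stronger predicted binding."""
    return heapq.nsmallest(keep, scores, key=scores.get)


if __name__ == "__main__":
    # Toy data: 2500 docking hits with made-up rescoring values.
    toy_scores = {f"compound_{i}": -0.01 * i for i in range(2500)}
    best = select_best_hits(toy_scores, keep=100)
    print(len(best), best[0])
```

Using `heapq.nsmallest` avoids sorting the whole list, although with only 2500 entries a full sort would be just as practical.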

Page 18: A perspective on life science grids in Europe


Conclusion

• The life science community really involves two communities: molecular biologists and bioinformaticians

• The two communities have different needs
– Biologists need high-level e-science environments
– Bioinformaticians also need growing resources to maintain, update and curate their databases

• Several grid projects in Europe are targeting both communities
– Embrace
– BioinfoGRID

• Significant progress has been made
– Still, web service interfaces to grid infrastructures need to be developed to foster adoption

Page 19: A perspective on life science grids in Europe


Objectives of the talk

• Explain how the projects in Europe fit together

• EGEE: the infrastructure

• The links with the bioinformatics community
– BioinfoGRID: use the EGEE grid as it is today. This means asking biologists and bioinformaticians to adapt
– Embrace: my role is to build the bridge between the community's current culture and today's grids.

• Notion of grid ecosystem
– Need for different services for the biology and bioinformatics community
– From MyGrid to EGEE

• Presentation of Embrace

• Presentation of BioinfoGRID