BioGrids in the US: Current status and future opportunities Craig A. Stewart 15 April 2004 stewart@iu.edu Director, Research and Academic Computing Director, Information Technology Core, Indiana Genomics Initiative

Page 1

BioGrids in the US: Current status and future opportunities

Craig A. Stewart

15 April 2004

stewart@iu.edu

Director, Research and Academic Computing

Director, Information Technology Core, Indiana Genomics Initiative

Page 2

License Terms

• Please cite this presentation as: Stewart, C.A. BioGrids in the US: Current status and future opportunities. 2004. Presentation. Presented at: International School on Physics and Industry workshop on Particle Accelerators and Detectors: from Physics to Medicine (Ettore Majorana Foundation and Center for Scientific Culture, Erice, Italy, 15 Apr 2005). Available from: http://hdl.handle.net/2022/14780

• Portions of this document that originated from sources outside IU are shown here and used by permission or under licenses indicated within this document.

• Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.

• Except where otherwise noted, the contents of this presentation are copyright 2004 by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

Page 3

What is a grid?

• A grid is a system including
  – computational resources
  – data storage resources
  – visualization resources
  – specialized instruments
  – tied together by high-performance networks
• Why grids?
  – Transcend limits of location
  – Use resources that would otherwise not be accessible
  – To do things that would otherwise not be possible

Page 4

Types of grids

By area of focus:
• Collaboration grid
• Computational grid
  – Supercomputer grids
  – Cycle scavenging
• Data grids
• Hybrid grids

Not included as part of this classification system: openness of software or organizational structure

Page 5

Computational Grid: TeraGrid

• US key national grid effort
• Based on Globus infrastructure
• Attempts to solve grid technology challenges in a very general fashion
• Currently 9 sites
• Little application thus far specifically in the area of biology
• Construction project: first we build it, then…

Page 6

Special purpose Computational Grid: IU/HLRS 2003 HPC Challenge

• Global analysis of Arthropod evolution
• One application: fastDNAml

• 8 types of systems; 641 processors; 6 continents

• 200 trees analyzed
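The pattern behind a run like this, one application farmed out over many processors, can be sketched in a few lines. This is only an illustration under stated assumptions: each candidate tree is assumed to be scorable independently, `score_tree` is a hypothetical placeholder (not fastDNAml's likelihood code or its actual scheduling), and threads on one machine stand in for processors at remote sites.

```python
from concurrent.futures import ThreadPoolExecutor


def score_tree(tree_id):
    """Hypothetical stand-in for an expensive per-tree evaluation.

    Returns a made-up, deterministic score; higher is better.
    """
    return -float((tree_id * 37) % 101) - 1.0


def best_tree(tree_ids, workers=4):
    """Score every tree in parallel and return (tree_id, score) of the best."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores[i] belongs to tree_ids[i].
        scores = list(pool.map(score_tree, tree_ids))
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return tree_ids[best_index], scores[best_index]


if __name__ == "__main__":
    print(best_tree(list(range(200))))
```

Because the tasks are independent, adding workers changes only how long the run takes, not which tree wins.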

Page 7

Cycle scavenging Computational Grids

• Folding at home (www.stanford.edu/group/pandegroup/folding/)

• Fight AIDS at home (fightaidsathome.scripps.edu/)

• Evolution@home (www.evolutionary-research.net/)

http://www.stanford.edu/group/pandegroup/folding/results.html

Page 8

Data grids in biology

• Research data grids
  – Centralized Life Sciences Data Service
  – TeraGrid (www.teragrid.org)
• Research and clinical data grids
  – SPIN (Shared Pathology Informatics Network)
  – Central Indiana Hospitals

Page 9

Centralized Life Sciences Data Service at Indiana University

• Goal: transparent and integrated access to multiple data sources

• Federated database approach focuses on establishing glue between existing databases

• “Private” databases stay where they are – under local control

• “Public” databases may be replicated locally for performance

• Queries are entered as standard SQL
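The federated idea above can be illustrated with a small sketch. It assumes two separate SQLite files standing in for a locally controlled "private" database and a replicated "public" one; the table names, gene names, and values are hypothetical, and SQLite's `ATTACH DATABASE` is used here only as a stand-in for the glue layer, not as a description of how CLSD itself works.

```python
import os
import sqlite3
import tempfile


def _create(path, ddl, insert_sql, rows):
    """Create one stand-alone database file with a single table."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(ddl)
        conn.executemany(insert_sql, rows)
        conn.commit()
    finally:
        conn.close()


def build_sources(directory):
    """Build two independent sources (all contents are made-up examples)."""
    local_path = os.path.join(directory, "workgroup.db")  # "private" source
    public_path = os.path.join(directory, "public.db")    # replicated "public" source
    _create(local_path,
            "CREATE TABLE expression (gene TEXT, level REAL)",
            "INSERT INTO expression VALUES (?, ?)",
            [("geneA", 2.5), ("geneB", 0.4)])
    _create(public_path,
            "CREATE TABLE location (gene TEXT, chromosome TEXT)",
            "INSERT INTO location VALUES (?, ?)",
            [("geneA", "chr4"), ("geneB", "chr11")])
    return local_path, public_path


def federated_query(local_path, public_path, min_level):
    """Join across both sources with one ordinary SQL statement."""
    conn = sqlite3.connect(local_path)
    try:
        # Attach the second file so both sources are visible to one query.
        conn.execute("ATTACH DATABASE ? AS pub", (public_path,))
        return conn.execute(
            "SELECT e.gene, e.level, l.chromosome "
            "FROM expression AS e "
            "JOIN pub.location AS l ON l.gene = e.gene "
            "WHERE e.level >= ? ORDER BY e.gene",
            (min_level,),
        ).fetchall()
    finally:
        conn.close()


def run_demo(min_level=1.0):
    with tempfile.TemporaryDirectory() as directory:
        local_path, public_path = build_sources(directory)
        return federated_query(local_path, public_path, min_level)


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that the person writing the query sees a single SQL statement, while each data source stays in its own file under its own control.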

Page 10

[Diagram. Labels: BLAST data sources (NR, EST, Swiss-Prot); BLAST engine; CLSD Engine (IBM II); public data sources (LIGAND, BIND, ENZYME, dbSNP); MS SQL Server; IUSM workgroup databases; Custom Web Application Portal]

Page 11

CLSD: Finding Genes

• Queries multiple databases, linking expression data (local and remote) and location data

• Built by research lab in IUSM

• Portal, built with CLSD as a grid back end

Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

Page 12

Understanding Microarray Data

The Microarray Data Portal was created by the Center for Medical Genomics at IU School of Medicine. Supported in part by the 21st Century Research & Technology Fund and the Indiana Genomics Initiative. The Indiana Genomics Initiative is supported in part by a grant from Lilly Endowment, Inc.

Page 13

Clinical data grids in Indiana

• SPIN (Shared Pathology Informatics Network)
  – Distributed database of anonymized data about pathology specimens
  – Provides data in compliance with US privacy regulations
  – SPIN software runs at participating institutions
• Regenstrief Institute
  – From data vaults to data grids
  – Hundreds of millions of patient records
  – Clinical service grid serving central Indiana hospitals

Page 14

Semantic requirements for BioData Grids

• Interoperability of nomenclature and metadata is a critical challenge!

• “A biologist would rather use another biologist’s toothbrush than another biologist’s terminology” – Thomas Kaufman

• Consistent semantics are required!
• Example projects:
  – GO: Gene Ontology
  – SBML: Systems Biology Markup Language
  – MAGE-ML: MicroArray Gene Expression Markup Language
  – SNOMED CT: SNOMED Clinical Terms
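To make the terminology problem concrete, here is a minimal sketch of mapping each lab's local labels onto shared identifiers before comparing records. Everything in it is an assumption made for illustration: the lab names, the labels, and the `TERM:` identifiers are hypothetical and are not real GO or SNOMED CT codes.

```python
# Hypothetical shared vocabulary: identifier -> preferred label.
SHARED_VOCABULARY = {
    "TERM:0001": "programmed cell death",
    "TERM:0002": "cell division",
}

# Hypothetical per-lab synonym tables pointing at the shared identifiers.
LAB_SYNONYMS = {
    "lab_a": {"apoptosis": "TERM:0001", "mitosis": "TERM:0002"},
    "lab_b": {"cell suicide": "TERM:0001", "cell splitting": "TERM:0002"},
}


def normalize(lab, label):
    """Return the shared identifier for a lab's local label, or None if unmapped."""
    key = label.strip().lower()
    return LAB_SYNONYMS.get(lab, {}).get(key)


def same_concept(lab1, label1, lab2, label2):
    """True only when both labels map to the same shared identifier."""
    first = normalize(lab1, label1)
    second = normalize(lab2, label2)
    return first is not None and first == second


if __name__ == "__main__":
    print(same_concept("lab_a", "Apoptosis", "lab_b", "cell suicide"))
```

Unmapped labels return `None` rather than a guess, so gaps in the mapping stay visible instead of silently merging unrelated terms.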

Page 15

Hybrid Grids

• SCrAPS
  – Advanced Photon Source at Argonne National Laboratory
  – “Better than being there” functionality
  – Real time integration of remote instruments, collaboration, computation, and visualization
  – Near real time data movement
• BIRN
  – Key NIH funded biogrid
  – Includes data, computation, visualization
• Encyclopedia of Life
• eDiamond

Page 16

Where are we today?

By area of focus:
• Collaboration grid
• Computational grid
  – Supercomputer grids
  – Cycle scavenging
• Data grids
• Hybrid grids

By status:
• Very general construction projects
• Handcrafted grid solutions
• Special projects (heroic efforts involved)
• Ongoing production services

Page 17

Looking Ahead

• Access to computing power via grids is still largely experimental
• Access to data via grids has transformed biomedical research and is transforming clinical practice
• Access to instruments is still experimental
• Great opportunities to advance biomedical research through use of grids. Biology is different:
  – Data is always collected somewhere
  – Affinities between grid structure and future software structure
• Sometimes grids are not the answer

Page 18

Acknowledgments

• This research was supported in part by the Indiana Genomics Initiative. The Indiana Genomics Initiative of Indiana University is supported in part by Lilly Endowment Inc.

• This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University, and in particular by IU’s relationship with IBM as an IBM Life Sciences Institute of Innovation.

• This material is based upon work supported by the National Science Foundation under Grant No. 0116050 and Grant No. CDA-9601632. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

• UITS staff: Mary Papakhian, Stephen Simms, Richard Repasky, Matt Link, John Samuel, Eric Wernert, Anurag Shankar, Andrew Arenson, John Herrin, Malinda Lingwall, W. Les Teach

Page 19

Thank you!

Further information available at: http://about.uits.iu.edu/divisions/rac/cv_stewart.html