Abhik Seal (PhD Student, Indiana University) (Researcher, OSDD CSIR); Anshu Bhardwaj, Scientist, OSDD Unit, Council of Scientific & Industrial Research, Delhi, India. 23rd March 2012, Washington DC. http://www.osdd.net Open Source Drug Discovery: CSIR-led Team India Consortium with Global Partnership. Affordable Healthcare for All. Cheminformatics and Open Source Drug Discovery: a case study in academic collaboration between the U.S. and India

Indo us 2012

May 10, 2015


Education

Abhik Seal

OSDD Presentation at Washington DC
Transcript
Page 1: Indo us 2012

Abhik Seal (PhD Student, Indiana University)

(Researcher, OSDD CSIR)

Anshu Bhardwaj Scientist, OSDD Unit

Council of Scientific & Industrial Research Delhi, India

23rd March 2012, Washington DC http://www.osdd.net

Open Source Drug Discovery CSIR-led Team India Consortium with Global Partnership

Affordable Healthcare for All

Cheminformatics and Open Source Drug Discovery: a case study in academic collaboration between the U.S. and India

Page 2: Indo us 2012

First Disease Target: Tuberculosis

Tuberculosis (TB) is one of the leading causes of death, ranking second only to HIV as the killer infectious disease of adults worldwide.

Source: http://www.globalhealthfacts.org/data/topic/map.aspx?ind=12

OSDD Focus : Tropical Neglected Diseases

At least one person in the world is newly infected with TB bacilli every second

Over 1000 deaths a day or 3 deaths every 2 mins

New TB cases 2010

Page 3: Indo us 2012

Countries that had reported at least one XDR-TB case by end March 2011

Argentina, Armenia, Australia, Austria, Azerbaijan, Bangladesh, Belgium, Botswana, Brazil, Burkina Faso, Bhutan, Cambodia, Canada, Chile, China, Colombia, Czech Republic, Ecuador, Egypt, Estonia, France, Georgia, Germany, Greece, India, Indonesia, Iran (Islamic Rep. of), Ireland, Israel, Italy, Japan, Kazakhstan, Kenya, Kyrgyzstan, Latvia, Lesotho, Lithuania, Mexico, Mozambique, Myanmar, Namibia, Nepal, Netherlands, Norway, Pakistan, Peru, Philippines, Poland, Portugal, Qatar, Republic of Korea, Republic of Moldova, Romania, Russian Federation, Slovenia, South Africa, Spain, Swaziland, Sweden, Tajikistan, Thailand, Togo, Tunisia, Ukraine, United Arab Emirates, United Kingdom, United States of America, Uzbekistan, Viet Nam

Page 4: Indo us 2012

TB Drug Discovery

Page 5: Indo us 2012

World TB Day is 24th March 2012

It commemorates the discovery of the TB bacillus (Mycobacterium tuberculosis) through sputum microscopy, which is still the diagnostic used to detect TB! No progress whatsoever, and we are discussing 'network communications'

Page 6: Indo us 2012

Challenges with Drug Discovery of Neglected Diseases

• Lack of market incentives

• TB is a complex disease – latency, relapse, resistance

• Clinical trials take a long time & study of relapse needs long follow-up (up to 18 months)

• Patient access is not direct; it is through government agencies

Page 7: Indo us 2012

Conventional vs Open Innovation Approach to Drug Discovery

Corporate HQ R&D

Cancer R&D

Neurological Disorder

Packaging

Sales

Clinical Trial

… R&D

Diabetics

Production

Pre-Clinical Trial Formulation

Page 8: Indo us 2012

Conventional vs Open Innovation Approach to Drug Discovery

Research groups Industry collaboration Individual participation Open Data Sharing

Page 9: Indo us 2012

OSDD Process Flow

Clinical trials

Public Funding of Clinical Trials

Government of India commitment - $46 million

Page 10: Indo us 2012

Drug Target Identification

Virtual Screening

Chemical Synthesis/library

Screening/Hit identification

Hit to Lead

Clinical Trials

Candidate

45

19

9

6

2

Status: OSDD Projects

Other projects aim to develop tools, databases and repositories for the OSDD community

1

September 2008…………………………………………………………………March 2012

Page 11: Indo us 2012

OSDD Platform

System Architecture

“Collaborative tools to accelerate neglected diseases research” in the book “Collaborative Computational Technologies for Biomedical Research”. Wiley and Sons. 2011

Page 12: Indo us 2012

Gene/operon predictions

Gene Expression

Regulatory Elements

Variation and repeats

Orthologs

Drug targets

Pathway/ Networks

More than a Million Data Points are now “Linked”

Deeksha Bhartiya Nitin Kumar

Mtb Data

* This is a representative set of post-genomics data available on TB

Collaborator: Dr. Vinod Scaria

Post-genomics data on Mtb is ‘Linked’ from disparate resources

Page 13: Indo us 2012

Comparison of Browsers

S.No. | Source | Tracks
1 | UCSC Genome Browser on Mycobacterium tuberculosis H37Rv 06/20/1998 Assembly | 6
2 | WebTb | Operon Map
3 | Argo Genome Browser | not web based
4 | PGBrowser: Pathogen Genome Browser | 3
5 | BioHealthBase | 16
6 | Ensembl | ~15
7 | Tbrowse | 100

Page 14: Indo us 2012

Deeksha Bhartiya

OpenLabNoteBook on SysBorgTB http://sysborgtb.osdd.net/bin/view/OpenLabNotebook/TBMapDataset

Deeksha Bhartiya Nitin Kumar

Page 15: Indo us 2012

From a mathematical point of view, creating an accurate model of a single mammalian cell may require generating and then solving somewhere between 100,000 and one million equations

Biology is complex !!

http://news.vanderbilt.edu/2011/10/robot-biologist/

The human brain can only process seven pieces of data at a time!!! Need automation & new technology to address the complexity

Page 16: Indo us 2012

Literature

Annotation Tools

Genomic Databases

Curated Annotations

Raw Annotations

OSDD C2D Community

800+ Student Researchers

Collaborative Curation

Pathway/Interactome | Gene Ontology | Protein Structure/Fold | Glycomics| Immunome

The “Connect to Decode” Programme

Page 17: Indo us 2012

Community Curation!!

Wrong (mark in red)

Right (mark in green)

Online discussion

Working on the cloud..

Many eyeballs make the ‘bug’ shallow!!!

Page 19: Indo us 2012

Mtb Metabolome Map on Payao

Sub-map of the metabolic network on Payao

SBI developed customized plug ins for OSDD for generating the metabolic map

Page 20: Indo us 2012

C2D April 2010 – Onsite Activity

Page 21: Indo us 2012

iOSDD890 From Social Network to Biological Network

Page 22: Indo us 2012

OSDD Community Effort to Understand Mtb Biology

Page 23: Indo us 2012

Within weeks, 830 volunteered to re-annotate the entire M. tuberculosis genome. The work started in December 2009 and was completed by April 2010, packing nearly 300 man-years into 4 months!

Source: Munos B. Can Open-Source Drug R&D Repower Pharmaceutical Innovation? Clin Pharmacol Ther 2010;87:534–536
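A rough check of the "nearly 300 man-years" figure, as a minimal sketch. It assumes (the slide does not say so) that each of the 830 volunteers counts as one full-time person for the whole 4 months.

```python
# Back-of-the-envelope check of the quoted effort.
# Assumption, not from the source: every volunteer is counted as
# full-time for the entire 4-month period.
volunteers = 830
months = 4

person_years = volunteers * months / 12
print(f"{person_years:.0f} person-years")
```

Under that assumption the effort comes to roughly 277 person-years, which is in line with the quoted figure.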

Source: Hiroaki Kitano Nature Chemical Biology 7, 323–326 (2011)

Social engineering for virtual 'big science' in systems biology

Page 24: Indo us 2012

Connect to Decode Phase II - Themes

Page 25: Indo us 2012

A large student community from colleges and universities is cloning, expressing and purifying selected Mtb genes

To clone and express select genes of Mycobacterium tuberculosis

Open Access Repository of Mtb clones

More than 120 sequence-confirmed clones are ready for distribution

http://sysborg2.osdd.net/group/sysborgtb/project-details/-/projects/show/3212

Page 26: Indo us 2012

OSDDChem: Open Chemistry Initiative

A large number of molecules are being submitted for screening

Page 27: Indo us 2012

Bhardwaj et al. Tuberculosis (Edinb). 2009 Sep;89(5):386-7

http://tbrowse.osdd.net

Computational Resources developed with Community participation

Bhardwaj et al. 2011 John Wiley & Sons, Inc.

Mtb essential genes database

TrapTB Mtb drug targets database

Chembio Toolkit Workflow engine with federated resources

AmPhyDB Antimycobacterial Phytomolecule Database

http://sysborg2.osdd.net

A Comprehensive database of Mtb transporters

Mtb-Human Interaction Database

Page 28: Indo us 2012

Q. Find novel genes and mutations & map known drug resistance mutations on genome of an MDR-TB strain

Enabling Complex Computational Analysis For Experimental Biologists/Chemists
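The question above splits the variants found in a strain into known resistance markers and candidates for novel mutations. A minimal sketch of that lookup step follows. The names `Variant`, `KNOWN_RESISTANCE` and `classify` are hypothetical, the two lookup entries are widely reported markers used only for illustration, and `geneX` is a made-up placeholder. This is not the OSDD workflow itself.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    change: str  # amino-acid change, e.g. "S315T"


# Illustrative lookup table only; a real analysis would load a curated
# resistance-mutation catalogue instead of two hard-coded entries.
KNOWN_RESISTANCE = {
    Variant("katG", "S315T"): "isoniazid",
    Variant("rpoB", "S450L"): "rifampicin",
}


def classify(variants):
    """Split variants into known resistance markers and unexplained ones."""
    known, novel = {}, []
    for variant in variants:
        drug = KNOWN_RESISTANCE.get(variant)
        if drug is not None:
            known[variant] = drug
        else:
            novel.append(variant)
    return known, novel


# Hypothetical variant calls from one strain ("geneX" is a placeholder).
strain_variants = [
    Variant("katG", "S315T"),
    Variant("rpoB", "S450L"),
    Variant("geneX", "A10V"),
]
known, novel = classify(strain_variants)
for variant, drug in known.items():
    print(f"{variant.gene} {variant.change}: known marker for {drug}")
for variant in novel:
    print(f"{variant.gene} {variant.change}: not in lookup, candidate for follow-up")
```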

Page 29: Indo us 2012

Galaxy provides -

• Simplified GUI design

• Ease of integrating modules

• Fewer components for creating workflows

• Sharable workflows for better collaboration

Page 30: Indo us 2012

Get data customized for extracting files from open lab note book

Custom APIs for importing input files from OSDD’s open lab note books

Page 31: Indo us 2012

• Workflows and the results of the workflows are stored as separate lab note books

• Lab note book has details of the experiments performed

• Results of one experiment may be invoked for analysis in another experiment

• All versions of the workflow and the results are stored

• Flexibility to execute nested workflows

Custom APIs for exporting results to OSDD’s Open lab note book
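The storage behaviour described on this slide (every version kept, results reusable by later experiments) can be pictured with a small data structure. This is only a sketch: `NotebookEntry`, `LabNotebook`, `record` and `latest` are hypothetical names, not the OSDD open lab note book API.

```python
from dataclasses import dataclass, field


@dataclass
class NotebookEntry:
    workflow: str
    version: int
    inputs: dict
    result: object


@dataclass
class LabNotebook:
    entries: list = field(default_factory=list)

    def record(self, workflow, inputs, result):
        """Append a new version; earlier versions are never overwritten."""
        version = 1 + sum(e.workflow == workflow for e in self.entries)
        entry = NotebookEntry(workflow, version, inputs, result)
        self.entries.append(entry)
        return entry

    def latest(self, workflow):
        """Return the most recent entry so another experiment can reuse it."""
        matches = [e for e in self.entries if e.workflow == workflow]
        if not matches:
            raise KeyError(workflow)
        return matches[-1]


notebook = LabNotebook()
notebook.record("variant_calling", {"strain": "sample-1"}, ["v1-calls"])
notebook.record("variant_calling", {"strain": "sample-1"}, ["v2-calls"])

# A second experiment takes the stored result of the first as its input.
upstream = notebook.latest("variant_calling")
notebook.record("annotation", {"calls": upstream.result}, "annotated")
print(upstream.version, len(notebook.entries))
```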

Page 32: Indo us 2012

Our Approach : Data & Tool integration

In addition to accessing heterogeneous sources of data like BioMart Central/UCSC Table Browser (http://genome.ucsc.edu/), the Open lab note book of http://sysborg2.osdd.net is interfaced with Galaxy

Standalone databases and tools

Tools as web services:

• Web services can be added as tools in Galaxy

• Extends the potential of Galaxy workflows

The process

Identify the module

Search for the WSDL

Code for client

Write XML for Galaxy

Configure & Integrate to Galaxy
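The "Write XML for Galaxy" step can be sketched as follows. This is a minimal, hedged example: it generates a Galaxy tool definition for a web-service client script, using only the standard tool XML elements (`tool`, `description`, `command`, `inputs`/`param`, `outputs`/`data`, `help`). The helper `build_tool_xml`, the script name `kegg_client.py` and the parameter `entry_id` are hypothetical, and the client code that actually calls the service is not shown.

```python
import xml.etree.ElementTree as ET


def build_tool_xml(tool_id, name, client_script, param_name, output_format="txt"):
    """Build a minimal Galaxy tool definition wrapping a client script."""
    tool = ET.Element("tool", id=tool_id, name=name, version="0.1.0")
    ET.SubElement(tool, "description").text = "Web service client (illustrative)"
    # Galaxy substitutes $<param> and $output when it runs the command.
    command = f"python {client_script} '${param_name}' '$output'"
    ET.SubElement(tool, "command").text = command
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name=param_name, type="text",
                  label="Entry identifier")
    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output", format=output_format)
    ET.SubElement(tool, "help").text = "Calls a remote web service and saves the reply."
    return ET.tostring(tool, encoding="unicode")


# Hypothetical client script and parameter name.
print(build_tool_xml("kegg_get_entry", "KEGG entry fetch",
                     "kegg_client.py", "entry_id"))
```

The resulting file would then be registered in the Galaxy tool configuration, which is the "Configure & Integrate" step.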

Page 33: Indo us 2012

ChemBio toolkit : >300 Modules integrated by OSDD Community

S.No. | Resource | Clients
1 | KEGG: Kyoto Encyclopedia of Genes and Genomes | 60
2 | GetEntry: DDBJ sequence search by accession ID | 43
3 | GPSR: tools | 33
4 | PDB: Protein Data Bank | 30
5 | BioModel: mathematical models of biological DB | 25
6 | Gtps: Gene Trek in Prokaryote Space | 8
7 | WSDbfetch: retrieve entries from biological dbs using entry identifiers or accession no. | 7
8 | Gibv: Genome Information Broker for Viruses | 7
9 | DDBJ: DNA Data Bank of Japan | 7
10 | Mafft: a multiple sequence alignment program | 4
11 | Fasta: DDBJ database | 4
12 | Ensembl: maintains automatic annotation | 4
13 | VecScreen: vector contamination | 4
14 | OMIM: Online Mendelian Inheritance in Man | 4
15 | Gtop: Gene-product Informatics | 3
16 | GO: Gene Ontology | 3
17 | SPS: Splicing Profile based Score | 2
18 | GIBIS: Genome Information Broker for Insertion Sequence | 1
19 | RefSeq: database of sequence | 1
20 | GIB: Genome Information Broker | 1
21 | GIBEnv: DDBJ database | 1
22 | TxSearch: Database indexing & searching | 1

Page 34: Indo us 2012

OSDD Community suggests tools for integration in Galaxy

Page 35: Indo us 2012

PubChem Bioassay data (approx. 100,000 molecules/dataset)

6000 descriptors/molecule

Successful Models

Screen PubChem

(30 million)

Data amplification: Cheminformatics

Potential Hits

o Downsizing and random validation require multiple calculations for validation of results

o Cross-validation up to 50+ times for each experiment
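To show why this adds up to a heavy compute load, here is a minimal sketch of repeated down-sampling plus k-fold cross-validation using only the Python standard library. The function names, the 5-fold by 10-repeat setup (50 runs) and the toy ID lists are assumptions for illustration. The models themselves are not shown.

```python
import random


def downsample(actives, inactives, rng):
    """Randomly shrink the larger class to the size of the smaller one."""
    n = min(len(actives), len(inactives))
    return rng.sample(actives, n) + rng.sample(inactives, n)


def k_fold_splits(items, k, rng):
    """Yield (train, test) pairs for one shuffled k-fold pass."""
    items = items[:]
    rng.shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


def repeated_cv(actives, inactives, k=5, repeats=10, seed=0):
    """Re-draw a balanced sample for each repeat, then split it k ways."""
    rng = random.Random(seed)
    for _ in range(repeats):
        balanced = downsample(actives, inactives, rng)
        yield from k_fold_splits(balanced, k, rng)


# Toy molecule IDs: 20 "actives" and 200 "inactives" (placeholders).
actives = list(range(20))
inactives = list(range(100, 300))
runs = list(repeated_cv(actives, inactives))
print(len(runs), "model fits per experiment")

# Size of one descriptor matrix at the scale quoted on the slide.
print(100_000 * 6_000, "descriptor values per dataset")
```

Each run means fitting and scoring a model, and a single dataset at the quoted scale already holds 600 million descriptor values.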

Page 36: Indo us 2012

C-DAC’s Garuda Grid – Indian Grid Computing Initiative

C-DAC is an R&D organization under the Ministry of Communication & Information Technology, India

C-DAC’s Garuda Grid is targeted at providing a facility for the scientific community, which would enable them to seamlessly access the distributed resources.

Compute Power of GARUDA: ~70 TFLOPS (6000 CPUs)

Currently there are 55 Garuda Partners

Has NKN (National Knowledge Network) connectivity at 10Gbps

Page 37: Indo us 2012

Customized Galaxy on GARUDA

Features:

• Integrated with Grid Authentication mechanism - Indian Grid Certificate Authority (IGCA)

• Integrated with Gridway Metascheduler - Job scheduling and management

• Integrated OSDD tools - Weka (for data mining) and Autodock (virtual screening)

• Provided support to upload multiple input files as a tar file

• Data libraries of OSDD community are uploaded and are shared by all users

• Integrated with PostgreSQL

Page 40: Indo us 2012

Garuda- Galaxy Job Submission - Flow

Garuda-OSDD Server

Galaxy GUI

1. User selects tool and input parameters

Galaxy Job Manager

2. Based on the tool, it sends the job to the correct runner.

Gridway Job runner

3. Gridway job runner uses the user’s Garuda proxy file for job submission

Internet
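The three-step flow can be pictured with the sketch below. It is illustrative only: `RUNNER_BY_TOOL`, `choose_runner` and `submit` are hypothetical names, not Galaxy or Gridway APIs. The tool-to-runner mapping is an assumption based on the Weka and Autodock integration mentioned earlier, and `/tmp/proxy_user1` is a made-up path.

```python
# Assumed mapping: grid-enabled tools go to the Gridway runner,
# everything else runs locally.
RUNNER_BY_TOOL = {"autodock": "gridway", "weka": "gridway"}
DEFAULT_RUNNER = "local"


def choose_runner(tool_id):
    """Step 2: pick the runner based on the selected tool."""
    return RUNNER_BY_TOOL.get(tool_id, DEFAULT_RUNNER)


def submit(tool_id, params, proxy_file=None):
    """Steps 1 to 3: take the user's choice, route it, attach the proxy."""
    runner = choose_runner(tool_id)
    if runner == "gridway" and proxy_file is None:
        raise ValueError("grid jobs need the user's Garuda proxy file")
    return {
        "tool": tool_id,
        "params": params,
        "runner": runner,
        "proxy": proxy_file if runner == "gridway" else None,
    }


job = submit("autodock", {"ligand": "mol1.pdbqt"}, proxy_file="/tmp/proxy_user1")
print(job["runner"])
```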

Page 41: Indo us 2012

Weka in Galaxy

Page 42: Indo us 2012

Garuda Usage by OSDD: Job Accounting

Page 43: Indo us 2012

High Performance Grid Computing for OSDD members

Page 44: Indo us 2012

Anshu Bhardwaj Council of Scientific & Industrial Research (CSIR),

India

Chintalapati Janaki, Center for Development of Advanced Computing (C-DAC),

India

www.osdd.net 25-26 May 2011

Customized Galaxy with applications as Web Services and on the Grid for Open Source Drug Discovery (OSDD)

A CSIR led team India consortium with global partnership for affordable healthcare

Page 45: Indo us 2012

“In the long history of humankind those who have learned to collaborate and improvise most effectively have prevailed.” -- Charles Darwin

Page 46: Indo us 2012

Cheminformatics: a strong case for community collaborative science

There is now an incredibly rich resource of public information relating compounds, targets, genes, pathways, and diseases. Just for starters there is in the public domain information on:

• ~30 million compounds and ~500,000 bioassays (PubChem, ChemSpider)

• ~60 million compound bioactivities (PubChem Bioassay)

• ~5,000 drugs (DrugBank)

• ~9 million protein sequences (SwissProt) and ~60,000 3D structures (PDB)

• ~14 million human nucleotide sequences (EMBL)

• ~20 million life science publications (PubMed)

• Multitude of other sets (drugs, toxicogenomics, chemogenomics, metagenomics …)

Page 47: Indo us 2012

I have thus chosen ‘Cheminformatics’ to study the vast pool of chemical compounds in much more detail and analyze them so as to narrow down to potential drug candidates. With the unique combination of IT and Chemistry, I am confident that one can actually derive much more meaningful information about a chemical entity on this earth. --- Rajdeep (BioIT)

I am an organic chemist. I prepared several organic molecules. We go for biological activity; most of the time it gives a negative result. But with the help of informatics in chemistry we can predict molecular properties. We can replace many ligands, substituents or functional groups easily. And we can design our desired molecule. --- Chirupulo

I am doing my M.Pharm in pharmaceutical chemistry, and I like cheminformatics because I need accurate results, but soon... and I am really interested in molecular modelling... so I am here. --- Haffy manaf

Cheminformatics deals with information about chemicals. It combines tools and techniques of IT to put information about chemical entities at the fingertips, at the click of a mouse. Databases are available for properties and descriptors. Software helps to calculate molecular properties. Cheminformatics thus comes as a handy tool for learning chemistry. --- Dr Keshav Mohan

Community Speaks: What excites them about Cheminformatics

Page 48: Indo us 2012

• Access to journals for chemical structures

• Lack of proper communication systems other than Skype

• Lack of software tools for accelerated drug discovery

• Need of high speed internet

• Need more experts to teach/train community members

• Proper time schedule of IU cheminformatics classes

Challenges in implementation of Cheminformatics projects

Page 49: Indo us 2012

Indiana University Initiatives (Prof David J Wild)

Cheminformatics Awareness

http://icep.wikispaces.com

Page 50: Indo us 2012

Association Search – visualize literature supported associations between any two entities (compound, drug, gene, pathway, disease, side effect). PLoS One, in press.

Semantic Link Association Prediction (SLAP) – find most highly associated entities (compound, drug, gene, pathway, disease, side effect) to any other entity, based on probabilistic weightings of graph edges based on public experimental datasets. Paper in preparation

BioLDA – find most highly associated entities to any other entity based on a complex topic model analysis of the literature (PubMed). PLoS One, 2011, 6 (3), e17243

See also: WENDI (J. Cheminf., 2010,2,6); Chemogenomic Explorer (BMC Bio. 2011,12,256), ChemLDA, ChemBioGrid (J. Chem. Inf. Model., 2007; 47(4) pp 1303-1307)

Tools Developed for Large Scale Bio-Chemical Data Mining

Page 51: Indo us 2012

OSDD virtual resources

Page 52: Indo us 2012

Cheminformatics

Curated molecule datasets

Cheminformatics Models

Data Mining and Analysis

HT Virtual screening

PubChem

ChEMBL

DrugBank

Experimental Assays

Community of About 400

Other Active Communities:

• OSDD Women Scientists Forum

• OSDD Junior Scientists Forum

Page 53: Indo us 2012

Ideal Case US-India Cheminformatics Collaboration

IU CCRG

Research

Education Industry partnerships OSDD

Wet lab research

Open cheminfo. group

Many interested students

Page 54: Indo us 2012

Funding for research in U.S.

$1.3m NIH

$360,000 Eli Lilly $120,000 Pfizer

Funding for research in OSDD

$46m Govt

$0

But in order to sustain…?

Page 55: Indo us 2012

Most of the biologists and chemists do not use computational workflows for their analysis

Awareness about the advantages of using such workflow engines

The Community needs to be trained for using the workflows

The Community needs to be trained for integrating applications

Web services vs standalone applications – each has its own set of advantages and limitations

Developers of algorithms should be encouraged to report results in globally accepted standard formats with standard ontologies

What should be our approach to reach out and integrate?

Page 56: Indo us 2012

Assembly line for drug discovery

I. Biological Repository
   i. Open access clinical strains repository
   ii. Open access clone repository
   iii. Open access protein repository

II. Chemical Repository
   i. Open access small molecule repository

III. Open Screening Facility
   i. Submit your compounds for anti-tuberculosis screening

OSDD Open Access Resources

Page 57: Indo us 2012

Inhibition of FAAL and FACL enzymes by acyl-sulfamoyl analogues

[Figure: chemical structures of compounds s12, s14 and s15]

Preclinical development of thiophene containing trisubstituted methanes

• Five synthetic ‘thiophene containing trisubstituted methanes’, which showed an MIC of <1.56 µg/ml and no cytotoxicity in mammalian cells, are being synthesised in PPP mode

Public Private Partnerships as Open Collaborative Endeavors to solve Scientific Challenges

Page 58: Indo us 2012

Collaboration with TB Alliance on Human Clinical Trials

PA-824 in combination with other drugs

Affordable Healthcare for All

Page 59: Indo us 2012

Systems Biology

Target based approach

Human Clinical Trials

Hit to Lead

Ligand based approach

Page 60: Indo us 2012

An Innovative Approach to Drug Discovery: A New Paradigm

Value

Biology/ Genomics

Target Identification

Target Validation

Hit(s)

Validated/ Quality Lead

Optimised Candidate Drug

Clinical Trials

Registered Drug

Risk

High Risk, Innovation Driven Sphere Strategy-> Open Innovation with best minds from academia/ industry

Process Oriented – Strategy-> Industry CROs’ Participation

Strategy-> OSDD to support clinical trials in collaboration with pharma

Innovation Funnel

Drugs to be available without IP encumbrances

Page 61: Indo us 2012

Major International Collaborations

Cheminformatics and e-learning

Structural Interactome to predict Off-Site Interactions of Drug Candidates

Metabolic Map Network Generation

Page 62: Indo us 2012

Author, Angela Saini

Geek Nation: How Indian Science Is Taking Over The World

http://www.sunday-guardian.com/bookbeat/tour-of-indian-science-that-fails-to-see-full-picture

Page 63: Indo us 2012

Science 24 February 2012: Vol. 335 no. 6071 p. 909

NEWS FOCUS

Page 64: Indo us 2012

OSDD Portfolio

March 2012

Page 65: Indo us 2012

OSDD Community &

The Team Leaders

Not all are shown

Page 66: Indo us 2012

Target Validation

PPI Validation

Cloning of potential drug targets

Galaxy Integration with Grid

Some of the OSDD PIs

Mtb Systems Biology

Mtb Genome Analysis

OSDDChem

Skype: anshu.bhardwaj

Cheminformatics Community + E-learning

Page 67: Indo us 2012

OSDD : A Global Community - More than 5500 members from over 130 countries

Statistics as of March 2012

Page 68: Indo us 2012

Open Source Drug Discovery (OSDD) Model “Team India Consortium with International Participation”

Council of Scientific and Industrial Research (CSIR), India

Current Partners

Mycobacterium tuberculosis

Wiki Portal

Exchange of Ideas/Results Community Participation

Lead Molecules Drug

Contract Research Organisations

Academia & Hospitals

Open Synthesis and Exchange

of Knowledge

PRECLINICAL & CLINICAL TRIAL

Candidate Targets

in silico SCREENING

in vivo VALIDATION

Lead Organization

Page 69: Indo us 2012

Together we can … and we should!

Matt Smadley | Flickr.com

http://www.osdd.net http://c2d.osdd.net

http://sysborg2.osdd.net

Skype: anshu.bhardwaj

http://scienceopenscience.blogspot.com/2011/12/osdd-song.html