Network integration of heterogeneous data

Post on 18-Nov-2014

DESCRIPTION

8th Course in Bioinformatics & Systems Biology for Molecular Biologists, Bertinoro di Romagna, Bertinoro, Italy, March 16-20, 2008

Transcript

Network integration of heterogeneous data

Lars Juhl Jensen, EMBL Heidelberg

association networks

STRING

STITCH

373 genomes

model organism databases

Ensembl

Genome Reviews

RefSeq

genomic context methods

phylogenetic profiles

Cell

Cellulosomes

Cellulose

conserved neighborhood

operons

bidirectional promoters

gene fusion

primary experimental data

expression profiles

GEO (Gene Expression Omnibus)

expression compendia

protein interactions

yeast two-hybrid

affinity purification

genetic interactions

synthetic lethality

BioGRID (General Repository for Interaction Datasets)

IntAct

MINT (Molecular Interactions Database)

DIP (Database of Interacting Proteins)

BIND (Biomolecular Interaction Network Database)

HPRD (Human Protein Reference Database)

literature mining

co-mentioning

statistical methods
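As a rough illustration of co-mentioning, the sketch below counts how often two gene names appear in the same abstract. It is a toy under stated assumptions, not the scoring scheme used in the talk: the function name `count_comentions`, the naive tokenizer, and the three example sentences are all hypothetical, and a real pipeline would need a synonyms list and disambiguation.

```python
import re
from collections import Counter
from itertools import combinations


def count_comentions(abstracts, gene_names):
    """Count, for each pair of names, the abstracts that mention both.

    Hypothetical helper: matching is exact and case-insensitive on
    alphanumeric tokens, with no synonym handling or disambiguation.
    """
    names = {name.upper() for name in gene_names}
    pair_counts = Counter()
    for text in abstracts:
        # Naive tokenization into a set of upper-cased alphanumeric tokens
        tokens = {tok.upper() for tok in re.findall(r"[A-Za-z0-9]+", text)}
        mentioned = sorted(tokens & names)
        # Each abstract contributes at most one count per pair
        for pair in combinations(mentioned, 2):
            pair_counts[pair] += 1
    return pair_counts


# Toy input (illustrative sentences, not real MEDLINE records)
abstracts = [
    "The expression of the cytochrome genes CYC1 and CYC7 is controlled by HAP1",
    "A toy sentence that mentions HAP1 together with CYC1",
    "A toy sentence that mentions only CYC7",
]
counts = count_comentions(abstracts, ["CYC1", "CYC7", "HAP1"])
```

Raw counts like these would still need a statistical treatment, for example comparing the observed co-mention count with what the individual name frequencies predict, before they could serve as evidence.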

NLP (Natural Language Processing)

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

MEDLINE

SGD (Saccharomyces Genome Database)

The Interactive Fly

OMIM (Online Mendelian Inheritance in Man)

good synonyms list

manual curation

orthographic variation

disambiguation

curated knowledge

complexes

MIPS (Munich Information Center for Protein Sequences)

Gene Ontology

pathways

KEGG (Kyoto Encyclopedia of Genes and Genomes)

Reactome

PID (NCI-Nature Pathway Interaction Database)

STKE (Signal Transduction Knowledge Environment)

variable reliability

raw quality scores

conservation

reproducibility

not comparable

benchmarking

calibrate vs. gold standard

probabilistic scores
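One way to picture the step from raw quality scores to probabilistic scores is to bin predictions by raw score and measure, per bin, the fraction confirmed by a gold standard. The sketch below is a minimal illustration under assumptions: the function `calibrate_by_bins`, the fixed bin size, and the toy data are hypothetical and are not taken from the talk.

```python
def calibrate_by_bins(scored_pairs, bin_size):
    """Map raw scores to the fraction of gold-standard positives per bin.

    scored_pairs: list of (raw_score, is_in_gold_standard) tuples.
    Returns a list of (mean_raw_score, fraction_true), best bin first.
    """
    # Rank predictions from highest to lowest raw score
    ordered = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    curve = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        mean_score = sum(score for score, _ in chunk) / len(chunk)
        fraction_true = sum(1 for _, ok in chunk if ok) / len(chunk)
        curve.append((mean_score, fraction_true))
    return curve


# Toy data: raw scores with hypothetical gold-standard labels
toy = [
    (0.9, True), (0.8, True), (0.7, True), (0.6, False),
    (0.3, True), (0.2, False), (0.1, False), (0.05, False),
]
curve = calibrate_by_bins(toy, bin_size=4)
```

In practice one would fit a smooth curve through such points so that any raw score can be converted; once every evidence type is expressed as a probability, scores from different sources become comparable.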

combine all evidence

$P = 1 - (1 - P_1)(1 - P_2)(1 - P_3)\cdots$
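The combination rule above treats the evidence types as independent: the combined probability is one minus the probability that every individual source is wrong. A minimal sketch, where the function name `combine_evidence` and the example values are assumptions chosen for illustration:

```python
def combine_evidence(probabilities):
    """Combine per-source probabilities as P = 1 - prod(1 - P_i).

    Assumes the evidence sources are independent, as the formula implies.
    """
    all_wrong = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # Probability that this source and all previous ones are wrong
        all_wrong *= 1.0 - p
    return 1.0 - all_wrong


# Three hypothetical evidence scores for one protein pair
combined = combine_evidence([0.5, 0.4, 0.2])  # 1 - 0.5 * 0.6 * 0.8 = 0.76
```

The combined score is never lower than the strongest single source, so several weak pieces of evidence can add up to a confident association.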

spread over many species

transfer by orthology

two modes

COG mode

protein mode

signaling network

NetworKIN

NetPhorest

phosphoproteomics

mass spectrometry

in vivo phosphosites

kinases are unknown

computational methods

sequence motifs

kinase families

overprediction

context

localization

expression

co-activators

scaffolders

association networks

the idea

NetworKIN

coverage

69 kinases

benchmarking

small-scale validation

ATM phosphorylates Rad50

Cdk1 phosphorylates 53BP1

high-throughput validation

multiple reaction monitoring

the future

more sequence motifs

NetPhorest

data organization

selection

benchmarking

179 kinases

89 SH2 domains

8 PTB domains

upstream signaling

downstream signaling

signaling pathways

Acknowledgments

STRING & STITCH
– Christian von Mering
– Michael Kuhn
– Manuel Stark
– Samuel Chaffron
– Philippe Julien
– Tobias Doerks
– Jan Korbel
– Berend Snel
– Martijn Huynen
– Peer Bork

Literature mining
– Evangelos Pafilis
– Jasmin Saric
– Rossitza Ouzounova
– Sean O’Donoghue
– Isabel Rojas

NetworKIN & NetPhorest
– Rune Linding
– Martin Lee Miller
– Gerard Ostheimer
– Francesca Diella
– Karen Colwill
– Jing Jin
– Pavel Metalnikov
– Vivian Nguyen
– Adrian Pasculescu
– Jin Gyoon Park
– Leona D. Samson
– Nikolaj Blom
– Rob Russell
– Peer Bork
– Søren Brunak
– Michael Yaffe
– Tony Pawson

http://larsjuhljensen.wordpress.com
