STRING: Modeling of biological systems through cross-species data integration
Lars Juhl Jensen, EMBL Heidelberg
May 25, 2015
Jensen et al., Drug Discovery Today: Targets, 2004
de Lichtenberg et al., Science, 2005
[Figure: network of protein modules. Module labels: TOM; TIM; MRPL; Ribosome Related; MRPS; Cytosolic Ribosome; Vacuolar Acidification; Fatty Acid Biosynth.; Arg Biosynth.; Leu/Val/Ile Biosynth.; RCCI; RCCII; RCCIII; RCCIV; RCCV; RCC_Asy; Secondary RCC_Asy; HAP Complex; PDH/KGD/GCV; Cell Wall & pH Reg.; DNA Repair; Replication/DNA Repair; Glucose sensing and CH remodelling; APC; Fission/Fusion; rRNA Processing; mRNA Processing; tRNA Splicing; TFIIIC Complex; m-AAA Complex; TCA Cycle; Iron Homeostasis/Chaperone Activity; GARP Complex; Actin; NUP]
[Figure: the module network shown again, labeled "Proteobacterial orthologs"]
[Figure: the module network shown again, labeled "Human disease orthologs"]
Genomic neighborhood
Species co-occurrence
Gene fusions
Database imports
Exp. interaction data
Microarray expression data
Literature co-mentioning
von Mering et al., Nucleic Acids Res., 2005
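The evidence types listed above each contribute a score for a given protein pair. As a minimal sketch of how such per-channel scores might be merged into one confidence value, assuming the channels are independent (a noisy-OR combination): the slides do not state the formula actually used, and the function name, channel names, and score values below are made up for illustration.

```python
from math import prod


def combine_scores(channel_scores):
    """Combine per-channel confidence scores (each in [0, 1]) into one score.

    Assumption: channels are independent, so the combined score is the
    probability that at least one channel is correct (noisy-OR).
    """
    for name, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1], got {score}")
    return 1.0 - prod(1.0 - s for s in channel_scores.values())


# Hypothetical scores for a single protein pair; the values are invented.
evidence = {
    "genomic_neighborhood": 0.40,
    "species_cooccurrence": 0.30,
    "microarray_expression": 0.50,
}
combined = combine_scores(evidence)
print(f"combined score: {combined:.2f}")  # prints: combined score: 0.79
```

Under this assumption, several weak but independent lines of evidence add up to a stronger combined score than any single channel provides.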
[Figure: The "Cellulosome". Labels: resting protuberances; protracted protuberance; cellulose; cell; cell wall; anchoring proteins; cellulosomes. © Trends Microbiol, 1999]
Jensen et al., to appear in Nat. Rev. Genet., 2005
Networ-Kin™
Phospho-peptide data (from MS)
Predict the kinase class (NetPhosK and Scansite)
Summary
• Quality control is crucial for large-scale data integration
– The raw data sets from high-throughput experiments are dirty
– Scoring, benchmarking, and filtering can greatly improve the quality
• Data integration should be done across multiple species
• Automated literature mining methods are (finally) maturing
– Restricted types of information can be extracted with high precision
– Text mining methods can lead to novel discoveries
• Protein networks are more than just pretty pictures
– Highly specific hypotheses can be made from high-quality networks
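The quality-control point above (scoring, benchmarking, filtering) can be illustrated with a small sketch. It assumes a gold-standard set of trusted protein pairs is available; the slides do not say which benchmark is used, and the function names, pairs, and scores here are hypothetical. Pairs missing from a gold standard are not necessarily wrong, so the precision computed this way is only an estimate.

```python
def precision_at_threshold(scored_pairs, gold_standard, threshold):
    """Fraction of pairs scoring >= threshold that appear in the gold standard.

    Returns None when no pair passes the threshold.
    """
    kept = [pair for pair, score in scored_pairs.items() if score >= threshold]
    if not kept:
        return None
    hits = sum(1 for pair in kept if pair in gold_standard)
    return hits / len(kept)


def filter_pairs(scored_pairs, threshold):
    """Keep only the pairs whose score is at least the threshold."""
    return {pair: s for pair, s in scored_pairs.items() if s >= threshold}


# Toy raw data set: unordered protein pairs with raw scores (invented values).
raw = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"A", "C"}): 0.8,
    frozenset({"B", "D"}): 0.3,
    frozenset({"C", "D"}): 0.2,
}
# Toy gold standard of trusted pairs (also invented).
gold = {frozenset({"A", "B"}), frozenset({"A", "C"}), frozenset({"C", "D"})}

precision_all = precision_at_threshold(raw, gold, 0.0)   # benchmark the unfiltered data
precision_high = precision_at_threshold(raw, gold, 0.5)  # benchmark after a score cutoff
filtered = filter_pairs(raw, 0.5)

print(precision_all, precision_high, len(filtered))
```

Benchmarking at several thresholds shows the trade-off: a stricter cutoff keeps fewer pairs but a larger share of them agree with the gold standard.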
Acknowledgments
• The STRING team
– Christian von Mering
– Berend Snel
– Martijn Huynen
– Daniel Jaeggi
– Steffen Schmidt
– Sean Hooper
– Mathilde Foglierini
– Julien Lagarde
– Peer Bork
• Text mining project
– Jasmin Saric
– Rossitza Ouzounova
– Isabel Rojas
• Networ-Kin
– Rune Linding
– Tony Pawson
• Analysis of yeast mitochondria
– Fabiana Perocchi
– Lars Steinmetz
• Analysis of yeast cell cycle
– Ulrik de Lichtenberg
– Thomas Skøt
– Anders Fausbøll
– Søren Brunak
Thank you!