Functional Linkages between Proteins

Functional Linkages between Proteins. Introduction Piles of Information Flakes of Knowledge AGCATCCGACTAGCATCAGCTAGCAGCAGA CTCACGATGTGACTGCATGCGTCATTATCTA.

Dec 28, 2015


Pamela Powers

Functional Linkages between Proteins


Introduction

Piles of Information, Flakes of Knowledge

[Slide graphic: raw DNA sequence text (AGCATCCGACTAGCATCAGCTAGCAG...) with the organism labels E. coli, S. cerevisiae, Drosophila]


Data Analysis

Traditional Methods (Experiments & Sequence Homology): the function of a protein

New Computational Methods: functional linkages between proteins


What does Functional Linkage mean?

1) A common structural complex

2) A common metabolic pathway

3) A common biological process

4) All answers are correct


New Computational Methods

Phylogenetic Profile Method

Rosetta Stone Method

Chromosomal Proximity Method

COG Database


Phylogenetic Profile Method


Phylogenetic Profile Method

Why Should It Work?

Biologically: proteins with similar profiles are likely to share a common pathway or complex

Mathematically: N genomes give 2^N possible profiles, so a profile is a nearly unique characterization
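
The profile comparison on this slide can be illustrated with a minimal sketch. The genome names, protein names, and presence/absence values below are hypothetical, and the mismatch threshold is an assumption chosen for illustration, not a value from the slides.

```python
from itertools import combinations

# Hypothetical data: 1 = a homolog of the protein is present in that genome.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D", "genome_E"]
PROFILES = {
    "P1": (1, 0, 1, 1, 0),
    "P2": (1, 0, 1, 1, 0),
    "P3": (0, 1, 1, 0, 1),
    "P4": (1, 0, 1, 0, 0),
}


def hamming(a, b):
    """Number of genomes in which the two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(profiles, max_mismatches=0):
    """Protein pairs whose profiles differ in at most max_mismatches genomes."""
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if hamming(profiles[p], profiles[q]) <= max_mismatches
    ]


# With N genomes there are 2^N possible presence/absence profiles.
n_possible = 2 ** len(GENOMES)

print(n_possible)
print(linked_pairs(PROFILES))
print(linked_pairs(PROFILES, max_mismatches=1))
```

Because the number of possible profiles grows as 2^N, two unrelated proteins are unlikely to share a profile by chance once N is large, which is why an exact or near-exact match is treated as evidence of a link.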


Rosetta Stone Method


Rosetta Stone Method (= Domain Fusion Analysis)

Interacting proteins have homologs in another organism fused into a single protein chain
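
A minimal sketch of the domain-fusion idea, assuming similarity hits have already been computed by some search tool. The hit list, protein names, and coordinates are hypothetical. Two query proteins are linked when they match the same protein in another genome over non-overlapping regions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical similarity hits:
# (query protein, target protein in another genome, start, end on the target).
HITS = [
    ("A", "fusedAB", 1, 200),
    ("B", "fusedAB", 220, 450),
    ("C", "fusedAB", 10, 190),  # overlaps A's region, so C looks like a homolog of A
    ("D", "other", 1, 300),
]


def overlaps(s1, e1, s2, e2):
    """True if the closed intervals [s1, e1] and [s2, e2] share any position."""
    return s1 <= e2 and s2 <= e1


def rosetta_links(hits):
    """Pairs of queries that align to different regions of the same target."""
    by_target = defaultdict(list)
    for query, target, start, end in hits:
        by_target[target].append((query, start, end))

    links = set()
    for target, regions in by_target.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 != q2 and not overlaps(s1, e1, s2, e2):
                links.add((min(q1, q2), max(q1, q2), target))
    return sorted(links)


print(rosetta_links(HITS))
```

In this toy data, A and C cover the same region of the fused protein, so they are treated as homologs of each other rather than as a linked pair, while each is linked to B.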


Rosetta Stone Method


Rosetta Stone Method

Why Should It Work?

Experimentally: among the ~4300 E. coli proteins, ~6800 pairs are similar to a single protein in another genome

Biologically:


Rosetta Stone Method

Validation Tests (E. coli):

1) Annotation of proteins from the SWISS-PROT database (68% vs. 15%)

2) Database of Interacting Proteins (6.4%)

3) Phylogenetic Profile Method (5% vs. 0.6%)


Models’ Success & Failure

              predicted +       predicted -

found +       True positive     False negative

found -       False positive    True negative
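
The four outcomes on this slide can be counted with a short sketch. The protein names and the predicted and found link sets are hypothetical. A pair counts as "found" when it is an experimentally known link and as "predicted" when the method reports it.

```python
from itertools import combinations


def pair(a, b):
    """Unordered protein pair, so (A, B) and (B, A) are the same link."""
    return frozenset((a, b))


def confusion_counts(predicted, found, all_pairs):
    """Count the four outcomes for predicted links against found links."""
    return {
        "TP": len(predicted & found),              # predicted and found
        "FN": len(found - predicted),              # found but not predicted
        "FP": len(predicted - found),              # predicted but not found
        "TN": len(all_pairs - predicted - found),  # neither
    }


proteins = ["A", "B", "C", "D"]
all_pairs = {pair(a, b) for a, b in combinations(proteins, 2)}
predicted = {pair("A", "B"), pair("A", "C"), pair("B", "D")}
found = {pair("A", "B"), pair("C", "D")}

counts = confusion_counts(predicted, found, all_pairs)
print(counts)
```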


Rosetta Stone Method

False Negatives

1) Interactions that have evolved through other mechanisms, i.e. there never was a fusion

2) The fused protein has disappeared during evolution


Rosetta Stone Method

False Positives

1) Proteins have been fused to regulate co-expression

2) Can’t distinguish between binding and non-binding homologs

3) Functional interaction rather than a physical interaction


Rosetta Stone Method

Reducing Errors


Rosetta Stone Method

Reconstruction of metabolic pathways


Functional Protein Networks


Orthologs vs. Paralogs

Orthologs: genes in different species that evolved from a common ancestral gene by speciation

Paralogs: genes related by duplication within a genome


Chromosomal Proximity

Proximate Genes

- On the same strand

- Within 300 bp, or

- Respective paralogs within 300 bp

Inferred Link: genes whose orthologs are close in at least three phylogenetic groups


Chromosomal Proximity

Direct Link: two proximate genes that are also proximate in at least two other phylogenetic groups

Indirect Link: genes whose orthologs are close in at least three other phylogenetic groups
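
The two link definitions can be sketched as follows, using the 300 bp same-strand criterion from the preceding slide. The gene coordinates, group names, and ortholog family names are hypothetical, each phylogenetic group is represented by a single genome, and the paralog rule is left out for brevity.

```python
MAX_GAP = 300  # bp threshold quoted on the slide


def proximate(g1, g2, max_gap=MAX_GAP):
    """g = (strand, start, end): same strand and at most max_gap bp apart."""
    if g1[0] != g2[0]:
        return False
    first, second = sorted((g1, g2), key=lambda g: g[1])
    return second[1] - first[2] <= max_gap


# Hypothetical data: group -> {ortholog family: (strand, start, end)}.
GROUPS = {
    "ref": {"X": ("+", 1000, 1900), "Y": ("+", 2000, 2900), "Z": ("+", 50000, 51000)},
    "g1": {"X": ("-", 500, 1400), "Y": ("-", 1450, 2300), "Z": ("-", 2400, 3300)},
    "g2": {"X": ("+", 100, 1000), "Y": ("+", 1200, 2000), "Z": ("+", 2100, 3000)},
    "g3": {"X": ("+", 10, 900), "Y": ("-", 950, 1800), "Z": ("-", 1900, 2800)},
}


def groups_where_proximate(fam_a, fam_b, groups):
    """Names of the groups in which both families occur and are proximate."""
    return [
        name
        for name, genes in groups.items()
        if fam_a in genes and fam_b in genes and proximate(genes[fam_a], genes[fam_b])
    ]


def classify(fam_a, fam_b, reference, groups):
    """Return 'direct', 'indirect', or None for a pair in the reference genome."""
    hits = groups_where_proximate(fam_a, fam_b, groups)
    others = [g for g in hits if g != reference]
    if reference in hits and len(others) >= 2:
        return "direct"
    if reference not in hits and len(others) >= 3:
        return "indirect"
    return None


print(classify("X", "Y", "ref", GROUPS))
print(classify("Y", "Z", "ref", GROUPS))
print(classify("X", "Z", "ref", GROUPS))
```

Here X and Y are neighbours in the reference genome and in two other groups, so the link is direct. Y and Z are far apart in the reference genome but neighbours in three other groups, so the link is indirect.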


Chromosomal Proximity


Chromosomal Proximity

Why Should It Work?

Biologically: conservation of proximity across multiple genomes suggests linked function

Logically: how likely is it that two genes are randomly proximate?


Chromosomal Proximity

Method’s Reliability:


Chromosomal Proximity

1586 links were detected between ortholog families

Validation:

KEGG: 80% in the same biological pathway

COG: 67% in the same functional category


Chromosomal Proximity

Total validated links per genome

380 direct, 352 inferred


Chromosomal Proximity


The COG Database

Clusters of Orthologous Groups

COGs creation

Each COG contains proteins that have evolved from an ancestral protein


The COG Database

Current Numbers (2004)

- 43 complete genomes

- 30 phylogenetic groups

- 2223 phylogenetic patterns

- 17 functional categories

- 3307 COGs

- 74059 proteins, 71% of total


The COG Database


The COG Database

How can we use it?

Direct Information

- Annotation of proteins (group and individual)

- Phylogenetic patterns

- Multiple alignment


The COG Database

How can we use it?

Detecting Missed Genes

- Patterns that contain all but one genome

- Mostly small proteins
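
The "all but one" check can be sketched as a scan over phylogenetic patterns. The genome names, COG identifiers, and patterns below are hypothetical. A COG that is present in every genome except one marks that genome as a place where a gene may have been missed during annotation.

```python
# Hypothetical data: COG -> set of genomes in which it has a member.
GENOMES = ["g1", "g2", "g3", "g4"]
COG_PATTERNS = {
    "COG_a": {"g1", "g2", "g3", "g4"},  # present everywhere
    "COG_b": {"g1", "g2", "g4"},        # absent only from g3
    "COG_c": {"g1", "g3"},              # absent from two genomes
}


def all_but_one(patterns, genomes):
    """Map each COG missing from exactly one genome to that genome."""
    candidates = {}
    for cog, present in patterns.items():
        missing = set(genomes) - present
        if len(missing) == 1:
            candidates[cog] = next(iter(missing))
    return candidates


print(all_but_one(COG_PATTERNS, GENOMES))
```

Each candidate still needs a follow-up search of the flagged genome, since the absence can also be a genuine gene loss.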


The COG Database

Growth in the number of groups

Are we approaching saturation?


COG on the WWW


Reliability of the Methods

Major validation: Experimentally known linkages

Validation by “keyword recovery” search


References

1) Eisenberg D, Marcotte EM, Xenarios I, Yeates TO. Protein function in the post-genomic era. Nature. 2000 405:823-826. Review.

2) Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science. 1999 285:751-753.

3) Yanai I, Mellor JC, DeLisi C. Identifying functional links between genes using conserved chromosomal proximity. Trends Genet. 2002 18:176-179.

4) Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001 29:22-28.

5) Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997 278:631-637.

6) http://www.ncbi.nlm.nih.gov/COG