CLASSIFICATION OF PROTEIN STRUCTURES (source: home.ku.edu.tr/~okeskin/CMSE520/lecture7.pdf)

Dec 05, 2018

hoangkhanh
CLASSIFICATION OF PROTEIN STRUCTURES

Comparing Protein Structures: Why?

• detect evolutionary relationships
• identify recurring motifs
• detect structure/function relationships
• predict function
• assess predicted structures
• classify structures - used for many purposes

Structure is more conserved than sequence

28% sequence identity

Chain/Domain Library

Hundreds of thousands of gene sequences are translated to proteins (SwissProt, PIR)

~35,000 solved structures (PDB) as of March, 2006

Goals:
• Predict structure from sequence
• Predict function based on sequence
• Predict function based on structure

Angel R. Ortiz et al. Protein Sci 2002; 11: 2606-2621

Fig. 1. Examples of structural alignments obtained with MAMMOTH

(A) Alignment of 1pts_A with 1mup. The structural alignment score is 9.52;

(B) Structural alignment of 1pgb with 5tss_A. The score in this case is 6.29.

• Recognizing Structural Similarity

• GOAL: Of all solved structures, find the structure or substructure most similar to a protein of interest

• By eye - tried and true! Requires an expert viewer with a GREAT memory!

• Automated detection -good for database searching

• How would you do this?

Features of automated structure comparison

1. What representation will you use for the protein?
2. How will you assess structural similarity?
3. How will you search the possible comparisons?
4. How significant is a “hit”?

» Example: Superposition to minimize RMSD
• 1. Define a measure of similarity: $\mathrm{RMSD} = \left\{ \sum |x_i - x_j|^2 / N \right\}^{1/2}$, where $x_i$ and $x_j$ are the positions of corresponding atoms in the two proteins and $N$ is the number of matched pairs
• 2. Determine the correspondence between residues of each protein (e.g. by sequence alignment, or a guess)
• 3. Align centers of mass
• 4. Use matrix methods to solve for the rotation that gives minimal RMSD (variety of methods available)
• 5. Evaluate the resulting number
• 6. Refine the alignment
• 7. Iterate

» Very useful. Commonly used for comparing similar structures.
» But… Not a good choice when proteins are only partially similar. Why?
» Also, points far from center of mass are weighted more heavily.
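Steps 3 to 5 of the superposition example can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own method: it assumes the residue correspondence is already known (points are paired by index), and it picks Horn's quaternion formulation as one of the "variety of methods available", using plain power iteration to find the largest eigenvalue. The function names `center` and `minimal_rmsd` are hypothetical.

```python
import math

def center(points):
    """Translate a point set so that its centroid sits at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in points]

def minimal_rmsd(mobile, target, iterations=1000):
    """RMSD of index-paired points after optimal translation and rotation."""
    assert len(mobile) == len(target) and len(mobile) > 0
    P, Q = center(mobile), center(target)
    n = len(P)
    # 3x3 correlation matrix S[a][b] = sum_i P[i][a] * Q[i][b]
    S = [[sum(p[a] * q[b] for p, q in zip(P, Q)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    # Horn's symmetric 4x4 matrix: its largest eigenvalue equals the best
    # achievable sum of dot products over all proper rotations.
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = 0.0
    shift = math.sqrt(sum(v * v for row in N for v in row))
    if shift > 0.0:
        # Shift the spectrum so every eigenvalue is positive, then run
        # power iteration from a generic start vector.
        M = [[N[i][j] + (shift if i == j else 0.0) for j in range(4)]
             for i in range(4)]
        v = [0.4, 0.3, 0.2, 0.1]
        for _ in range(iterations):
            w = [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient with the unshifted matrix
        lam = sum(v[i] * N[i][j] * v[j] for i in range(4) for j in range(4))
    e_p = sum(x * x for p in P for x in p)
    e_q = sum(x * x for q in Q for x in q)
    return math.sqrt(max(0.0, (e_p + e_q - 2.0 * lam) / n))

# Hypothetical coordinates: a copy rotated 90 degrees about z and shifted
pts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 1), (2, 2, 3)]
rotated = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in pts]
mirrored = [(-x, y, z) for x, y, z in pts]
print(minimal_rmsd(pts, rotated))   # close to zero
print(minimal_rmsd(pts, mirrored))  # clearly non-zero: a mirror image is not a rotation
```

Because every squared deviation enters the sum with equal weight, a few badly matched residues can dominate the result, which is one way to see why RMSD suits globally similar structures better than partially similar ones.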

Algorithms for detecting structure similarity

Dynamic Programming
- works on 1D strings; reduce the problem to this
- can’t accommodate topological changes
- example: Sequential Structure Alignment Program (SSAP)

3D Comparison/Clustering
- identify secondary structure elements or fragments
- look for a similar arrangement of these between different structures
- allows for different topology, large insertions
- example: Vector Alignment Search Tool (VAST)
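To make the dynamic programming idea concrete, here is a minimal sketch of global alignment scoring on 1D strings, using secondary-structure strings (H = helix, E = strand, C = coil) as the reduced representation. It is a plain Needleman-Wunsch recurrence, not SSAP itself, and the scoring values and the function name `global_alignment_score` are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of two strings (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            # Best of: align the two characters, or put a gap in either string
            score[i][j] = max(
                diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap
            )
    return score[-1][-1]

print(global_alignment_score("HHHEEE", "HHHEEE"))
print(global_alignment_score("HHEE", "HHCC"))
```

The recurrence only ever moves forward along both strings, which is why this family of methods cannot match elements that appear in a different order in the two chains.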

Distance Matrix
- identify contact patterns of groups that are close together
- compare these for different structures
- fast, insensitive to insertions
- example: Distance-matrix ALIgnment (DALI)
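The representation behind the distance matrix approach can be sketched as follows. This is only the core idea (intramolecular distances do not change when a structure is rotated or translated, so no superposition is needed); it is not the DALI algorithm, which compares sub-matrices and searches over alignments. The coordinates and the function names are hypothetical, and the two structures are assumed to have the same number of index-paired residues.

```python
import math

def distance_matrix(coords):
    """All pairwise distances between points of one structure."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def mean_abs_difference(dm_a, dm_b):
    """Average absolute difference over the upper triangles of two matrices."""
    n = len(dm_a)
    assert n == len(dm_b)
    pairs = n * (n - 1) // 2
    if pairs == 0:
        return 0.0
    total = sum(
        abs(dm_a[i][j] - dm_b[i][j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return total / pairs

coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 2.0)]
# Same shape, rotated 90 degrees about z and shifted
moved = [(-y + 10.0, x - 5.0, z + 1.0) for x, y, z in coords]
# Different shape: every distance doubled
stretched = [(2 * x, 2 * y, 2 * z) for x, y, z in coords]

same = mean_abs_difference(distance_matrix(coords), distance_matrix(moved))
diff = mean_abs_difference(distance_matrix(coords), distance_matrix(stretched))
print(same, diff)
```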

Unit vector RMS
- map structure to sphere of vectors
- minimize the difference between spheres
- fast, insensitive to outliers
- example: Matching Molecular Models Obtained from Theory (MAMMOTH)
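A rough sketch of the unit vector representation: each step between consecutive backbone positions is reduced to a direction of length 1, so every point lies on a unit sphere and no single residue can contribute more than a bounded amount. The sketch assumes a trace of distinct consecutive points and compares two vector sets as given; the real method also minimizes the difference over rotations, which is omitted here. The names `unit_vectors` and `unit_vector_rms` are hypothetical.

```python
import math

def unit_vectors(trace):
    """Directions between consecutive points, each scaled to length 1."""
    vectors = []
    for p, q in zip(trace, trace[1:]):
        d = math.dist(p, q)  # assumed non-zero: consecutive points differ
        vectors.append(tuple((q[k] - p[k]) / d for k in range(3)))
    return vectors

def unit_vector_rms(u, v):
    """RMS distance between two index-paired sets of unit vectors."""
    assert len(u) == len(v) and len(u) > 0
    total = sum(
        sum((a - b) ** 2 for a, b in zip(x, y)) for x, y in zip(u, v)
    )
    return math.sqrt(total / len(u))

trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in trace]
u = unit_vectors(trace)
opposite = [(-a, -b, -c) for a, b, c in u]

print(unit_vector_rms(u, unit_vectors(shifted)))  # translation has no effect
print(unit_vector_rms(u, opposite))               # the largest possible value
```

Two unit vectors can be at most 2 apart, so the score is bounded no matter how far apart the underlying atoms are, which is the sense in which this measure is insensitive to outliers.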

Structural Classification of Proteins

• Structure vs. structure comparisons (e.g. using DALI) reveal related groups of proteins

• Structurally-similar proteins with detectable sequence homology are assumed to be evolutionarily related

• Similarities between non-homologous proteins suggest convergent evolution to a favorable or useful fold

• A number of different groups have proposed classification schemes
  – SCOP (by hand)
  – CATH (uses SSAP)
  – FSSP (uses Dali)

Classification of structures

SCOP: http://scop.mrc-lmb.cam.ac.uk/scop/ (domains, good annotation)

CATH: http://www.biochem.ucl.ac.uk/bsm/cath/

CE: http://cl.sdsc.edu/ce.html

Dali Domain Dictionary: http://columba.ebi.ac.uk:8765/holm/ddd2.cgi

FSSP: http://www2.ebi.ac.uk/dali/fssp/ (chains, updated weekly)

HOMSTRAD: http://www-cryst.bioc.cam.ac.uk/~homstrad/

HSSP: http://swift.embl-heidelberg.de/hssp/

SCOP Hierarchy of Structures

Class: upper hierarchy

Family: evolutionarily related with a significant sequence identity - 2327 in SCOP

Superfamily: different families whose structural and functional features suggest common evolutionary origin - 1294 in SCOP

Fold: different superfamilies having same major secondary structures in same arrangement and with same topological connections - 800 in SCOP

Classification of structural data (SCOP)

Pennisi, E. (1998) Science 279, 978
Hubbard et al. (1999) Nucleic Acids Res 254.

PIR web site: http://pir.georgetown.edu

605 folds

947 superfamily

1557 family

12794 protein

http://scop.mrc-lmb.cam.ac.uk/scop/

945 FOLDS

1539 SUPERFAMILIES

2845 FAMILIES

70859 DOMAINS

Statistics from July 2005

Classification of Protein Structure: SCOP

http://scop.mrc-lmb.cam.ac.uk/scop/ http://scop.berkeley.edu/

Classification of Protein Structure: SCOP

SCOP is organized into 4 hierarchical layers:

(1) Classes:

(2) Folds: Major structural similarity. Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections.

(3) Superfamily: Probable common evolutionary origin. Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies.

(4) Family: Clear evolutionary relationship. Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater.
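The four layers map naturally onto a four-part identifier. The sketch below assumes SCOP's compact `sccs` notation, which is not shown in the slides: a string such as `a.1.1.2` read as class.fold.superfamily.family. Under that assumption, the deepest level two domains share tells you how closely SCOP considers them related. The helper names `parse_sccs` and `deepest_shared_level` are hypothetical.

```python
from typing import NamedTuple, Optional

class ScopId(NamedTuple):
    cls: str          # class letter, e.g. "a"
    fold: int
    superfamily: int
    family: int

def parse_sccs(sccs: str) -> ScopId:
    """Split an identifier of the assumed form class.fold.superfamily.family."""
    parts = sccs.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {sccs!r}")
    return ScopId(parts[0], int(parts[1]), int(parts[2]), int(parts[3]))

def deepest_shared_level(a: str, b: str) -> Optional[str]:
    """Name of the lowest layer at which two identifiers still agree."""
    levels = ("class", "fold", "superfamily", "family")
    shared = None
    for level, x, y in zip(levels, parse_sccs(a), parse_sccs(b)):
        if x != y:
            break
        shared = level
    return shared

print(deepest_shared_level("a.1.1.2", "a.1.1.3"))  # same superfamily, different family
print(deepest_shared_level("a.1.1.2", "a.1.2.1"))  # same fold only
print(deepest_shared_level("a.1.1.2", "b.1.1.2"))  # nothing shared
```

Comparison stops at the first mismatch because fold, superfamily, and family numbers are only meaningful within their parent layer.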

Classification of Protein Structure: CATH

http://www.biochem.ucl.ac.uk/bsm/cath/

Classification of Protein Structure: CATH

[Figure: the CATH hierarchy. C (Class): Alpha, Mixed Alpha-Beta, Beta. A (Architecture): Sandwich, Barrel, Super Roll. T (Topology): TIM Barrel, Other Barrel.]

The DALI Domain Dictionary

http://www.ebi.ac.uk/dali/domain/

• All-against-all comparison of PDB90 using DALI
• Define score of each pair as a Z-score
• Regroup proteins based on pair-wise score:
  – Z-score > 2: “Folds”
  – Z-score > 4, 6, 8, 10: sub-groups of “folds” (different from Families, and sub-families!)
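The regrouping step can be illustrated with a small sketch. The slides do not say which clustering rule is used, so this example assumes single linkage: two proteins end up in the same group whenever a chain of pairs with Z-score above the threshold connects them. The protein names, the scores, and the function name `group_by_z` are all made up for illustration.

```python
def group_by_z(names, pair_scores, threshold):
    """Single-linkage groups: join every pair whose Z-score exceeds threshold."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), z in pair_scores.items():
        if z > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

# Hypothetical pairwise Z-scores
names = ["A", "B", "C", "D"]
scores = {("A", "B"): 12.0, ("B", "C"): 5.0, ("A", "C"): 3.0, ("C", "D"): 1.5}

for cutoff in (2, 4, 6, 8, 10):
    print(cutoff, group_by_z(names, scores, cutoff))
```

Raising the cutoff can only split groups, never merge them, so the thresholds 2, 4, 6, 8, 10 produce nested sub-groups of each "fold".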

Summary

• Classification is an important part of biology; protein structures are not exempt

• Prior to being classified, proteins are cut into domains

• While all structural biologists agree that proteins are usually a collection of domains, there is no consensus on how to delineate the domains

• There are three main protein structure classifications:
  - SCOP (manual): source of evolutionary information
  - CATH (semi-automatic): source of geometric information
  - FSSP (automatic): source of raw data