EBI is an Outstation of the European Molecular Biology Laboratory. PDBe-fold (SSM) A web-based service for protein structure comparison and structure searches Gaurav Sahni, Ph.D.

Dec 29, 2015

Page 1:

EBI is an Outstation of the European Molecular Biology Laboratory.

PDBe-fold (SSM)

A web-based service for protein structure comparison and structure searches

Gaurav Sahni, Ph.D.

Page 2:

Structure alignment may be defined as identification of residues occupying “equivalent” geometrical positions

Unlike in sequence alignment, residue type is neglected

Used for:

• measuring structural similarity

• protein classification and functional analysis

• database searches

Structure alignment

Page 3:

Sequence and Structure Alignments

Sequence alignment

Based on residue identity, sometimes with a modified alphabet

--AARNEDDDGKMPSTF-L
E-AARNFG-DGK--STFIL

Algorithms: Dynamic programming + heuristics

Applications: BLAST, FASTA, FLASH and others

Used for:

• evolution studies

• protein function analysis

• guessing on structure similarity
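As a small illustration of alignment "based on residue identity", the sketch below counts identical residues in the example alignment from this slide. It assumes the example string is two rows of 19 characters each (the slide layout is not fully recoverable), and it assumes identity is reported over columns where neither row has a gap. The function name `identity_counts` is hypothetical.

```python
def identity_counts(row_a, row_b, gap="-"):
    """Return (identical, aligned) column counts for two gapped rows."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    # Keep only columns where both rows hold a residue (assumed convention).
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    identical = sum(1 for a, b in aligned if a == b)
    return identical, len(aligned)


# Assumed two-row reading of the slide's example alignment.
ROW_A = "--AARNEDDDGKMPSTF-L"
ROW_B = "E-AARNFG-DGK--STFIL"

identical, aligned = identity_counts(ROW_A, ROW_B)
print(f"{identical} identical residues over {aligned} aligned columns")
```

Other conventions divide by the full alignment length or by the shorter sequence length, so the denominator should be stated whenever an identity percentage is quoted.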

Structure alignment

Based on geometrical equivalence of residue positions, residue type disregarded

Used for:

• protein function analysis

• some aspects of evolution studies

Algorithms: Dynamic programming, graph theory, MC, geometric hashing and others

Applications: DALI, VAST, CE, MASS, SSM and others

Page 4:

Methods

Many methods are known:

• Distance matrix alignment (DALI, Holm & Sander, EBI)

• Vector alignment (VAST, Bryant et al., NCBI)

• Depth-first recursive search on SSEs (DEJAVU, Madsen & Kleywegt, Uppsala)

• Combinatorial extension (CE, Shindyalov & Bourne, SDSC)

• Dynamic programming on Cα (Gerstein & Levitt)

• Dynamic programming on SSEs (SSA, Singh & Brutlag, Stanford University)

• many more …

SSM employs a 2-step procedure:

A. Initial structure alignment and superposition using SSE graph matching

B. Cα-alignment

Page 5:

Three dimensional graph matching

• Protein secondary structure elements (SSEs) are natural and convenient objects for building three-dimensional graphs.

• Secondary structure provides most of the functionality and is conserved through evolution.

• Details of the protein fold are expressed in terms of two types of SSE: helices and strands.

Page 6:

• SSE graphs: SSEs are represented by vectors

• Each SSE serves as a graph vertex, labelled (Ti, ρi)

• Any two vertices are connected by an edge of length L; the edge label describes the position and orientation of the connected SSEs

• Each edge is labelled with a property vector: the angles α1 and α2 between the edge and the two vertices, the torsion angle between the vertices, and the edge length L

Graph representation of SSEs
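The graph representation described on this slide can be sketched as follows. This is an illustrative construction under stated assumptions, not the SSM implementation: each SSE is reduced to a vector with a start and an end point, the edge is assumed to join the midpoints of the two SSE vectors, and the names `SSE`, `vertex_label` and `edge_label` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SSE:
    kind: str     # "H" for helix, "S" for strand
    start: tuple  # (x, y, z) where the SSE vector begins
    end: tuple    # (x, y, z) where the SSE vector ends


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def angle(a, b):
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding


def midpoint(s):
    return tuple((p + q) / 2 for p, q in zip(s.start, s.end))


def vertex_label(s):
    """Vertex label: SSE type and length of the SSE vector."""
    return (s.kind, norm(sub(s.end, s.start)))


def edge_label(si, sj):
    """Edge property vector: length L, angles alpha1/alpha2, signed torsion."""
    vi, vj = sub(si.end, si.start), sub(sj.end, sj.start)
    e = sub(midpoint(sj), midpoint(si))  # assumed: edge joins midpoints
    n1, n2 = cross(vi, e), cross(e, vj)
    torsion = math.atan2(dot(cross(n1, n2), e) / norm(e), dot(n1, n2))
    return {"length": norm(e), "alpha1": angle(vi, e),
            "alpha2": angle(vj, e), "torsion": torsion}


helix = SSE("H", (0, 0, 0), (0, 0, 2))
strand = SSE("S", (3, -1, 1), (3, 1, 1))
mirrored = SSE("S", (3, 1, 1), (3, -1, 1))  # mirror image of `strand` in y
print(vertex_label(helix), edge_label(helix, strand))
```

The torsion angle is the only component of the label that changes sign under reflection, which is why it can separate a structure from its mirror image.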


Page 7:

• The sets of vertices, edges and their labels provide a full definition of the graph.

• A graph matching algorithm requires a set of rules for comparing individual vertices and edges; the tolerances are chosen empirically.

• Both relative and absolute vertex and edge lengths are used for comparison, which allows larger absolute differences for longer vertices and edges.

• Torsion angle comparison distinguishes mirror-symmetry mates.
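The comparison rules on this slide might look like the sketch below. The slide does not say how the relative and absolute tolerances are combined, so taking the larger of the two limits is an assumption, and all tolerance values here are placeholders rather than the empirically chosen SSM values. Edge labels are plain dictionaries with hypothetical keys.

```python
import math


def lengths_match(a, b, rel_tol=0.3, abs_tol=2.0):
    """Accept a length difference up to the larger of an absolute limit and
    a limit proportional to the longer length (assumed combination rule)."""
    return abs(a - b) <= max(abs_tol, rel_tol * max(a, b))


def angle_difference(a, b):
    """Smallest absolute difference between two angles in radians."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def vertices_match(v1, v2):
    """Vertices are (type, length) pairs; helices only match helices."""
    return v1[0] == v2[0] and lengths_match(v1[1], v2[1])


def edges_match(e1, e2, angle_tol=math.radians(30)):
    """Compare edge length, both edge-to-vertex angles and the torsion."""
    return (lengths_match(e1["length"], e2["length"])
            and angle_difference(e1["alpha1"], e2["alpha1"]) <= angle_tol
            and angle_difference(e1["alpha2"], e2["alpha2"]) <= angle_tol
            and angle_difference(e1["torsion"], e2["torsion"]) <= angle_tol)


edge = {"length": 10.0, "alpha1": 1.2, "alpha2": 0.9, "torsion": math.pi / 2}
similar = {"length": 11.5, "alpha1": 1.3, "alpha2": 1.0, "torsion": 1.4}
mirror = dict(edge, torsion=-math.pi / 2)  # same geometry, opposite hand

print(edges_match(edge, similar), edges_match(edge, mirror))
```

With these placeholder tolerances a 7 Å difference is accepted between lengths of 30 and 37 but a 3 Å difference is rejected between lengths of 3 and 6, which is the behaviour the third bullet describes.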


Page 8:

[Figure: SSE graphs of two structures, A and B, with vertices labelled as helices (H1, H2, …) and strands (S1, S2, …)]

Matching the SSE graphs yields a correspondence between secondary structure elements, that is, groups of residues. The correspondence may be used as an initial guess for structure superposition and alignment of individual residues.

SSE graph matching
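To make the idea of an SSE correspondence concrete, here is a brute-force backtracking search for the largest set of vertex pairs whose labels and mutual edges are compatible. It is a stand-in chosen for brevity, not the matching algorithm SSM uses, and it simplifies edge labels to a single distance. The function `largest_common_match` and the toy data are hypothetical.

```python
def largest_common_match(va, vb, ea, eb, v_ok, e_ok):
    """Return the largest list of (i, j) pairs such that vertex i of graph A
    is compatible with vertex j of graph B and all edges between matched
    vertices are compatible too. Exponential time; for toy inputs only."""
    best = []

    def extend(i, current):
        nonlocal best
        # Prune: even matching every remaining vertex cannot beat `best`.
        if len(current) + (len(va) - i) <= len(best):
            return
        if i == len(va):
            best = list(current)
            return
        used = {j for _, j in current}
        for j in range(len(vb)):
            if j in used or not v_ok(va[i], vb[j]):
                continue
            if all(e_ok(ea[i][k], eb[j][l]) for k, l in current):
                current.append((i, j))
                extend(i + 1, current)
                current.pop()
        extend(i + 1, current)  # also try leaving vertex i unmatched

    extend(0, [])
    return best


# Toy graphs: vertex labels are SSE types, edge labels are distances.
VA = ["H", "S", "S"]
EA = [[0, 5, 10], [5, 0, 7], [10, 7, 0]]
VB = ["S", "H", "H", "S"]
EB = [[0, 5, 20, 7], [5, 0, 25, 10], [20, 25, 0, 30], [7, 10, 30, 0]]

match = largest_common_match(
    VA, VB, EA, EB,
    v_ok=lambda a, b: a == b,
    e_ok=lambda x, y: abs(x - y) <= 1.0,
)
print(match)
```

Each pair in the result links one SSE of structure A to one SSE of structure B; the residues in those SSEs give the starting correspondence for superposition.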

Page 9:

What next?

• So far we have considered the three-dimensional arrangement of secondary structure elements (SSEs) regardless of their ordering in the protein chain.

• Connectivity of SSEs is significant (but can be neglected when comparing mutated/engineered proteins).

• In previous methods connectivity was either preserved or neglected.

Page 10:

PDBefold (SSM) Approach – a more flexible way

• There are three options –

1) connectivity of SSEs neglected

Example: SSE connectivity differs, but the SSE graphs are geometrically identical

Page 11:

2) Soft connectivity: the general order of SSEs along their protein chains is the same in both structures, BUT any number of missing/unmatched SSEs between matched ones is allowed

3) Strict connectivity: matched SSEs follow the same order along their protein chains and are separated only by an equal number of matched/unmatched SSEs in both structures

• To obtain a 3D alignment of individual residues, represent them by their C-alpha atoms and use the results of graph matching as a starting point
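The three connectivity options can be expressed as a check on a candidate SSE match. In this sketch a match is a list of (i, j) pairs, where i and j are the positions of the matched SSEs along chains A and B; the function name and the mode strings are hypothetical.

```python
def satisfies_connectivity(pairs, mode):
    """Check a list of (i, j) SSE index pairs against a connectivity mode.

    "none":   any geometric match is accepted.
    "soft":   matched SSEs appear in the same order in both chains.
    "strict": same order, and consecutive matched SSEs are separated by the
              same number of SSEs in both chains.
    """
    if mode == "none":
        return True
    if mode not in ("soft", "strict"):
        raise ValueError(f"unknown connectivity mode: {mode}")
    ordered = sorted(pairs)  # sort by position in chain A
    steps = list(zip(ordered, ordered[1:]))
    # Order along chain B must increase together with chain A.
    if any(j2 <= j1 for (_, j1), (_, j2) in steps):
        return False
    if mode == "soft":
        return True
    return all(i2 - i1 == j2 - j1 for (i1, j1), (i2, j2) in steps)


shuffled = [(0, 2), (1, 0), (2, 1)]  # same geometry, different chain order
gapped = [(0, 0), (2, 1), (3, 4)]    # same order, unequal gaps
regular = [(0, 1), (1, 2), (3, 4)]   # same order, equal gaps

for name, m in [("shuffled", shuffled), ("gapped", gapped), ("regular", regular)]:
    print(name, [satisfies_connectivity(m, k) for k in ("none", "soft", "strict")])
```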

Page 12:

SSE-alignment is used as an initial guess for Cα-alignment

Cα-alignment is an iterative procedure based on the expansion of shortest contacts at best superposition of structures

[Figure: chains A and B with matched helices and matched strands highlighted]

Cα-alignment is a compromise between the alignment length Nalign and r.m.s.d. Longest contacts are unmapped in order to maximise the Q-score:

$$ Q = \frac{N_{align}^{2}}{\left(1 + (\mathrm{r.m.s.d.}/R_0)^{2}\right) N_A N_B} $$

Cα-alignment
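The trade-off between alignment length and r.m.s.d. can be illustrated with a short sketch. It assumes the Q-score has the form Q = Nalign² / ((1 + (r.m.s.d./R0)²) · NA · NB), with NA and NB the numbers of residues in the two chains and R0 = 3 Å as in the published description of SSM (treat that value as an assumption here). The trimming loop drops the longest contacts while Q improves, but it keeps the superposition fixed, whereas the slide describes an iterative procedure at best superposition. The function names are hypothetical.

```python
import math


def q_score(n_align, rmsd, n_a, n_b, r0=3.0):
    """Quality score: grows with alignment length, shrinks with r.m.s.d."""
    return n_align ** 2 / ((1 + (rmsd / r0) ** 2) * n_a * n_b)


def rmsd_of(distances):
    """Root mean square of the distances between matched C-alpha atoms."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def trim_longest_contacts(distances, n_a, n_b, r0=3.0):
    """Greedily unmap the longest contact while doing so raises Q.

    Simplification: the superposition is held fixed, so distances are
    not recomputed after each removal. Expects a non-empty list.
    """
    kept = sorted(distances)
    best = q_score(len(kept), rmsd_of(kept), n_a, n_b, r0)
    while len(kept) > 1:
        trial = kept[:-1]  # drop the current longest contact
        q = q_score(len(trial), rmsd_of(trial), n_a, n_b, r0)
        if q <= best:
            break
        kept, best = trial, q
    return kept, best


# Ten close contacts and one outlier, for two 20-residue chains.
contacts = [1.0] * 10 + [12.0]
kept, q = trim_longest_contacts(contacts, n_a=20, n_b=20)
print(len(kept), round(q, 3))
```

Identical structures give Q = 1, and Q falls as either the r.m.s.d. grows or the aligned fraction of the chains shrinks.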

Page 13:

More than 2 structures are aligned simultaneously

Multiple alignment is not equal to the set of all-to-all pairwise alignments

Helps to identify common structure motifs for a whole family of structures

Multiple structure alignment

Page 14:

Macromolecular Structure Database, 31.10.07

If you have to ask….

• Are there any structures in the PDB that are similar to mine?

• What SCOP and/or CATH family could my structure belong to?

• Can I get some idea about the possible function of my protein from its structural similarity to others?

• Can I get a multiple alignment of many of my structures?

Use PDBefold.

Upload your own PDB file for analysis !!

Page 15:

SSM output

Table of matched Secondary Structure Elements

Table of matched backbone Cα atoms with distances between them at best structure superposition

Rotation-translation matrix of best structure superposition

Visualisation in Jmol and Rasmol

r.m.s.d. of Cα-alignment

Length of Cα-alignment Nalign

Number of gaps in Cα-alignment

Quality score Q

Statistical significance scores P(S), Z

Sequence identity
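Two of the outputs listed above, the rotation-translation matrix and the r.m.s.d., can be related with a short sketch. It assumes the common convention x' = R·x + t, with R a 3×3 rotation matrix and t a translation vector; the layout of the matrix in the actual SSM output may differ. The function names are hypothetical.

```python
import math


def apply_transform(rot, trans, point):
    """Return R * point + t for a 3x3 matrix `rot` and 3-vector `trans`."""
    return tuple(
        sum(rot[r][c] * point[c] for c in range(3)) + trans[r]
        for r in range(3)
    )


def rmsd_after_superposition(rot, trans, moving, fixed):
    """r.m.s.d. between matched atoms once `moving` has been transformed."""
    total = 0.0
    for p, q in zip(moving, fixed):
        moved = apply_transform(rot, trans, p)
        total += sum((a - b) ** 2 for a, b in zip(moved, q))
    return math.sqrt(total / len(moving))


# A 90-degree rotation about z followed by a shift of (1, 2, 3).
ROT = [[0, -1, 0],
       [1, 0, 0],
       [0, 0, 1]]
TRANS = (1, 2, 3)

moving = [(1, 0, 0), (0, 1, 0)]
fixed = [apply_transform(ROT, TRANS, p) for p in moving]
print(fixed, rmsd_after_superposition(ROT, TRANS, moving, fixed))
```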

Page 16:

The PDBefold Search Interface

Page 17:

The Results Page For Pairwise Alignment

Page 18:

Analyzing the result from a particular pairwise alignment

Page 19:

Residue by Residue Structural alignment result

Page 20:

Multiple 3D alignment using PDBefold

Page 21:

Results from multiple 3D alignment

Page 22:

it is quite possible that residue identity plays a much less significant role in protein structure than often believed

as a consequence, the role of residue identity in protein function may be often overestimated

using sequence identity for the assessment of structural or functional features may give more false negatives than expected

physical-chemical properties of residues should be given preference over residue identity in structure and function analysis

modern methods for structure alignment are efficient; there is little sense in using sequence alignment in structure-related studies

Conclusion