Protein Structure Databases
Databases of three-dimensional structures of proteins, where the structure has been solved using X-ray crystallography or nuclear magnetic resonance (NMR) techniques
Protein databases:
PDB (Protein Data Bank)
Swiss-Prot
PIR (Protein Information Resource)
SCOP (Structural Classification of Proteins)
The most extensive database for 3-D structure is PDB
Visualization of Proteins
A number of programs convert atomic coordinates of 3-D structures into views of the molecule
They allow the user to manipulate the molecule by rotation, zooming, etc.
Critical in drug design: yields insight into how the protein might interact with ligands at active sites
Most popular programs for viewing 3-D structures:
Protein Explorer: http://www.umass.edu/microbio/chime/pe/protexpl/frntdoor.htm
http://kinemage.biochem.duke.edu/website/kinhome.html
Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html
Alignment of Protein Structure
Compare the 3-D structure of one protein against the 3-D structure of a second protein
Compare positions of atoms in the three-dimensional structures
Look for positions of secondary structural elements (helices and strands) within a protein domain
Examine distances between carbon atoms to determine the degree to which the structures may be superimposed
Side chain information can be incorporated: buried; visible
Structural similarity between proteins does not necessarily mean an evolutionary relationship
Structure alignment
Simple case: two closely related proteins with the same number of amino acids.
Find a transformation T to achieve the best superposition
Transformations
Translation: $x' = x + t$
Translation and Rotation, i.e. Rigid Motion (Euclidean space): $x' = Rx + t$
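As a minimal sketch of a rigid motion, the snippet below applies x' = Rx + t to a short list of 3-D points in pure Python. The function names (`rotation_z`, `apply_rigid`, `distance`) and the choice of a rotation about the z-axis are illustrative assumptions, not something the slides specify.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z-axis (an illustrative choice of R)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_rigid(points, rotation, translation):
    """Return x' = R x + t for every point x in the list."""
    moved = []
    for x in points:
        rotated = [sum(rotation[i][k] * x[k] for k in range(3)) for i in range(3)]
        moved.append([rotated[i] + translation[i] for i in range(3)])
    return moved

def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

# Two hypothetical atom positions, rotated by 90 degrees and shifted by (1, 1, 1).
points = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
moved = apply_rigid(points, rotation_z(math.pi / 2), [1.0, 1.0, 1.0])
```

A rigid motion leaves all inter-atomic distances unchanged, so `distance(moved[0], moved[1])` should agree with `distance(points[0], points[1])` up to floating-point rounding.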
Types of Structure Comparison
Sequence-dependent vs. sequence-independent structural alignment
Global vs. local structural alignment
Pairwise vs. multiple structural alignment
With a known residue correspondence, superposition can be solved in O(n) time.
Useful for comparing structures of the same protein solved by different methods, in different conformations, or through dynamics.
Useful for evaluating protein structure prediction.
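One common way to score how well two structures with paired residues superimpose is the root-mean-square deviation (RMSD) of the paired atom positions. The sketch below is a deliberate simplification: it removes only the translation, by moving both centroids to the origin, and does not search for the best rotation (a full method would add that step). Each pass is a single loop over the points, which is in line with the linear-time remark above. All function names and coordinates are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of 3-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]

def center(points):
    """Translate the points so that their centroid sits at the origin."""
    c = centroid(points)
    return [[p[i] - c[i] for i in range(3)] for p in points]

def rmsd_after_centering(a, b):
    """RMSD between paired points after removing the translation only."""
    if len(a) != len(b):
        raise ValueError("structures must have the same number of points")
    a0, b0 = center(a), center(b)
    total = sum((p[i] - q[i]) ** 2 for p, q in zip(a0, b0) for i in range(3))
    return math.sqrt(total / len(a))

# Hypothetical coordinates: 'shifted' is a pure translation of 'model',
# while 'bent' has its last point displaced.
model = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]]
shifted = [[x + 10.0, y - 2.0, z + 5.0] for x, y, z in model]
bent = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 3.0, 0.0]]
```

Here `rmsd_after_centering(model, shifted)` is zero up to rounding, because the two sets differ only by a translation, while `rmsd_after_centering(model, bent)` is positive.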
Sequence-independent Structure Comparison
Given two configurations of points in three-dimensional space, find the transformation T that produces the “largest” superimposition of corresponding 3-D points.
Accurate secondary structure prediction can provide important information for tertiary structure prediction
Protein function prediction
Protein classification
Predicting structural change
An easier problem than 3-D structure prediction (more than 40 years of history).
α-helix (30-35%)
Hydrogen bonds between backbone C=O (carbonyl) and N-H (amide) groups within the same chain, 4 positions apart
3.6 residues / turn, 1.5 Å rise / residue
Typically a right-handed turn
Most abundant secondary structure
α-helix formers: A, C, L, M, E, Q, H, K
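The helix geometry quoted above (3.6 residues per turn, 1.5 Å rise per residue) allows a quick size estimate. The helper below is a small illustrative calculation; the function name and the 18-residue example are assumptions, not taken from the slides.

```python
RESIDUES_PER_TURN = 3.6          # from the slide
RISE_PER_RESIDUE_ANGSTROM = 1.5  # from the slide

def helix_dimensions(n_residues):
    """Return (number of turns, length along the axis in angstroms)."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE_ANGSTROM
    return turns, length

# Rise per full turn (the pitch) follows from the two constants.
pitch = RESIDUES_PER_TURN * RISE_PER_RESIDUE_ANGSTROM

# Hypothetical 18-residue helix.
turns, length = helix_dimensions(18)
```

For the 18-residue example this works out to about 5 turns and 27 Å, with a pitch of about 5.4 Å per turn.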
Definition of Secondary Structure of Proteins (DSSP)
The DSSP code
H = alpha helix
B = residue in isolated beta-bridge
E = extended strand, participates in beta ladder
G = 3-helix (3/10 helix)
I = 5-helix (pi helix)
T = hydrogen-bonded turn
S = bend
CASP standard: H = (H, G, I), E = (E, B), C = (T, S)
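The reduction from the DSSP codes to the three CASP states can be written as a simple lookup. This is a minimal sketch: the mapping is the one listed above, while the function name, the example string, and the decision to treat any unlisted symbol (such as a blank) as coil are assumptions.

```python
# Mapping from the slide: H = (H, G, I), E = (E, B), C = (T, S)
DSSP_TO_3STATE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C",
}

def reduce_dssp(dssp_string):
    """Collapse a DSSP string to the three states H, E and C."""
    # Assumption: symbols outside the table (e.g. ' ' or '-') count as coil.
    return "".join(DSSP_TO_3STATE.get(code, "C") for code in dssp_string)

# Hypothetical DSSP assignment, for illustration only.
reduced = reduce_dssp("SEEEETTGGGHHHHIB")
```

For the example string the result is `CEEEECCHHHHHHHHE`.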
Given a protein sequence (primary structure)
GHWIATHWIATRGQLIREAYEDYGQLIREAYEDYRHFSSSSECPFIP
Predict its secondary structure content (C = coil, H = alpha helix, E = beta strand)
GHWIATHWIATRGQLIREAYEDYGQLIREAYEDYRHFSSSSECPFIP
CEEEEEEEEEECHHHHHHHHHHHHHHHHHHHHHHCCCHHHHCCCCCC
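Given a per-residue assignment like the one above, the secondary structure content is just the fraction of residues in each state. The snippet below is a small sketch of that bookkeeping; the function name `ss_content` is an assumption, and the two strings are copied from the example as shown.

```python
from collections import Counter

sequence = "GHWIATHWIATRGQLIREAYEDYGQLIREAYEDYRHFSSSSECPFIP"
structure = "CEEEEEEEEEECHHHHHHHHHHHHHHHHHHHHHHCCCHHHHCCCCCC"

def ss_content(ss):
    """Fraction of residues in each of the states H, E and C."""
    counts = Counter(ss)
    return {state: counts.get(state, 0) / len(ss) for state in "HEC"}

content = ss_content(structure)
for state in "HEC":
    print(state, round(100 * content[state], 1), "%")
```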
Algorithm: Chou-Fasman Method
Examines windows of 5-6 residues to predict structure
From the PDB database, calculate the propensity for a given amino acid to adopt a certain ss-type:
$P_\alpha^i = \dfrac{p(\alpha, aa_i)}{p(\alpha)\, p(aa_i)}$
($aa_i$: amino acid $i$; $\alpha$: ss type)
Example: #Ala = 2,000, #residues = 20,000, #helix = 4,000, #Ala in helix = 500
$p(\alpha, aa_i) = 500/20{,}000$, $p(\alpha) = 4{,}000/20{,}000$, $p(aa_i) = 2{,}000/20{,}000$
$P = 500 / (4{,}000/10) = 1.25$
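The propensity calculation can be checked directly from the raw counts. This is a minimal sketch; the function and parameter names are hypothetical, and the counts are the ones in the alanine example.

```python
def propensity(n_aa_in_ss, n_aa, n_ss, n_total):
    """P = p(ss, aa) / (p(ss) * p(aa)), computed from raw counts.

    Algebraically this equals
    (n_aa_in_ss / n_total) / ((n_ss / n_total) * (n_aa / n_total)),
    rearranged so that only one division is needed.
    """
    return (n_aa_in_ss * n_total) / (n_ss * n_aa)

# Counts from the example: 500 Ala in helix, 2,000 Ala,
# 4,000 helical residues, 20,000 residues in total.
p_ala_helix = propensity(n_aa_in_ss=500, n_aa=2000, n_ss=4000, n_total=20000)
```

The result is 1.25: alanine appears in helices 1.25 times as often as would be expected if residue type and structure type were independent.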
Chou-Fasman parameters
Note: The parameters given in the textbook are 100*P_i
Scan through the peptide and identify regions where 4 out of 6 contiguous residues have P(H) > 1.00. That region is declared an alpha-helix.
Extend the helix in both directions until a set of four contiguous residues that have an average P(H) < 1.00 is reached. That is declared the end of the helix.
If the segment defined by this procedure is longer than 5 residues and the average P(H) > P(E) for that segment, the segment can be assigned as a helix.
Repeat this procedure to locate all of the helical regions in the sequence.
Example
Residue  T   S   P   T   A    E    L    M    R   S   T   G
P(H)     69  77  57  69  142  151  121  145  98  77  69  57
Initiation
Identify regions where 4/6 have a P(H) > 1.00: the “alpha-helix nucleus”
Propagation
Extend the helix in both directions until a set of four residues has an average P(H) < 1.00.
Prediction
If the average P(H) > P(E) for that segment, the segment can be assigned as a helix.
P(H) = 107.5% > P(E) = 85.9%
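The initiation and propagation steps can be sketched on the example peptide as follows. This is one possible reading of the rules, not a reference implementation: the values are on the 100*P scale, so the threshold 1.00 becomes 100, and extension is assumed to test the four-residue window that would include the next residue. Function names are hypothetical. P(E) values are not listed for this peptide, so the final P(H) > P(E) comparison is left out.

```python
def mean(values):
    return sum(values) / len(values)

def find_nucleus(scores, window=6, min_hits=4, threshold=100):
    """Return (start, end) of the first window with enough residues above threshold."""
    for start in range(len(scores) - window + 1):
        hits = sum(1 for s in scores[start:start + window] if s > threshold)
        if hits >= min_hits:
            return start, start + window
    return None

def extend(scores, start, end, threshold=100, probe=4):
    """Grow the half-open segment [start, end) in both directions.

    Assumed stopping rule: stop when the four-residue window that would
    include the next residue has an average below the threshold.
    """
    while end < len(scores) and mean(scores[end - probe + 1:end + 1]) >= threshold:
        end += 1
    while start > 0 and mean(scores[start - 1:start - 1 + probe]) >= threshold:
        start -= 1
    return start, end

sequence = "TSPTAELMRSTG"
p_helix = [69, 77, 57, 69, 142, 151, 121, 145, 98, 77, 69, 57]

nucleus = find_nucleus(p_helix)
start, end = extend(p_helix, *nucleus)
segment = sequence[start:end]
segment_mean = mean(p_helix[start:end])
```

Under these assumptions the nucleus is found at positions 2 to 7 (zero-based), the extended segment is `PTAELMRS`, and its average P(H) is 107.5, which agrees with the value quoted in the example.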
Scan through the peptide and identify a region where 3 out of 5 of the residues have a value of P(E) > 1.00. That region is declared a beta-sheet.
Extend the sheet in both directions until a set of four contiguous residues that have an average P(E) < 1.00 is reached. That is declared the end of the beta-sheet.
Any segment of the region located by this procedure is assigned as a beta-sheet if the average P(E) > 1.05 and the average P(E) > P(H) for that region.
Any region containing overlapping alpha-helical and beta-sheet assignments is taken to be helical if the average P(H) > P(E) for that region. It is a beta-sheet if the average P(E) > P(H) for that region.
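The acceptance and overlap rules above reduce to comparisons of segment averages. The sketch below assumes values on the 100*P scale (so 1.05 becomes 105); the function names and the per-residue numbers are hypothetical and serve only to exercise the rules.

```python
def mean(values):
    return sum(values) / len(values)

def accept_strand(p_strand, p_helix, min_mean=105):
    """Beta-sheet rule: average P(E) > 1.05 and average P(E) > average P(H)."""
    return mean(p_strand) > min_mean and mean(p_strand) > mean(p_helix)

def resolve_overlap(p_helix, p_strand):
    """Return 'H' or 'E' for a region that carries both assignments.

    An exact tie is not covered by the rules, so None is returned then.
    """
    h, e = mean(p_helix), mean(p_strand)
    if h > e:
        return "H"
    if e > h:
        return "E"
    return None

# Hypothetical per-residue values for one five-residue region (x100 scale).
region_p_helix = [106, 113, 77, 106, 121]
region_p_strand = [170, 138, 75, 170, 130]

is_strand = accept_strand(region_p_strand, region_p_helix)
winner = resolve_overlap(region_p_helix, region_p_strand)
```

With these illustrative numbers the averages are 104.6 for P(H) and 136.6 for P(E), so the region passes the beta-sheet test and the overlap is resolved in favour of `E`.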
Chou-Fasman algorithm
Beta-turn
To identify a bend at residue number j, calculate the following value
Predict the secondary structure of the following protein sequence:
Ala Pro Ala Phe Ser Val Ser Leu Ala Ser Gly Ala