Protein Structure Bioinformatics
Session 1: Introduction
Rehab Ahmed
CBSB, Faculty of Science, University of Khartoum
Faculty of Pharmacy, University of Khartoum
Introduction to Bioinformatics online course: IBT_2016
Introduction to Bioinformatics online course: IBT_2016 Protein Structural Bioinformatics, Trainer: Rehab Ahmed
Learning Objectives
• To recap some basics of amino acids and proteins
• To study the different levels of protein structures
• To shed light on how protein structures are
determined.
• To learn about some relevant databases, file formats
and file viewers.
Learning Outcomes
By the end of this session and practical, students are
expected to be able to:
• Explore some resources and tools in the PDB
database.
• Use some web servers to predict protein secondary
structure.
Structure of Amino Acid
https://www.mun.ca/biology/scarr/iGen3_06-01.html
• DSSP is programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from X-ray coordinates.
• Kabsch W, Sander C (1983). "Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features". Biopolymers 22(12): 2577–2637.
DSSP (Helix, Strand and loops)
DSSP (Dictionary of Protein Secondary Structure)
Secondary structure              Symbol
Alpha helix                      H
3-10 helix                       G
π-helix                          I
Beta bridge                      B
Beta strand                      E
Turn                             T
High curvature (bend)            S
Coil (blank; no rule applies)    C
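Prediction servers usually report only three states (helix, strand, loop), so the eight DSSP codes are often collapsed before comparison. The sketch below shows one way to do that. The grouping (H, G, I to helix; E, B to strand; everything else to loop) is an assumption: it is one common convention rather than part of DSSP itself, and the names `DSSP_TO_THREE_STATE` and `reduce_dssp` are made up for illustration.

```python
# Collapse DSSP's eight-state codes into three states:
# H = helix, E = strand, C = loop/coil.
# Assumption: this grouping is one common convention, not defined by DSSP.
DSSP_TO_THREE_STATE = {
    "H": "H",  # alpha helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi helix
    "E": "E",  # beta strand
    "B": "E",  # isolated beta bridge
    "T": "C",  # turn
    "S": "C",  # bend (high curvature)
    "C": "C",  # coil, when written explicitly
    " ": "C",  # blank: no rule applies
}


def reduce_dssp(states: str) -> str:
    """Map a string of DSSP codes to a three-state string of equal length."""
    # Unknown symbols are treated as loop rather than raising an error.
    return "".join(DSSP_TO_THREE_STATE.get(s, "C") for s in states)


example = "  HHHHGGGTT EEEEBS "
print(reduce_dssp(example))
```

Other reductions exist (for example, keeping only H as helix and only E as strand), and the choice affects reported prediction accuracy, so it should be stated whenever results are compared.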
Experimental determination of Secondary Structure
• Spectroscopy
• UV circular dichroism (CD)
• IR Spectroscopy
• NMR
http://www.ap-lab.com/images/CD_STANDARDS.gif
• For instance, helical propensity of residue type X
• Pα(X) = frequency (X in helix) / frequency (X)
• Pα > 1 = favours helix (e.g., Pα(Glu)=1.51)
• Pα < 1 = disfavours helix (e.g., Pα(Gly)=0.57)
Gerard J. Kleywegt’s slide
Secondary structure prediction
• Database of 2000 residues
• 100 are Alanines
• 500 residues are in a helix
• 50 alanines are in a helix
• What is the propensity for Ala to be in a helix? Is Ala a good helix former?
Gerard J. Kleywegt’s slide
Secondary structure prediction
• Pα(X) = frequency (X in helix) / frequency (X)
• Pα (Ala) = freq (Ala, α) / freq (Ala)
• freq (Ala, α) = 50/500 = 0.1
• freq (Ala) = 100/2000 = 0.05
• Pα (Ala) = 0.1/0.05 = 2.0
• Ala is a good helix former!
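The same arithmetic can be written as a small function, which makes it easy to repeat for other residue types or other secondary structure states. This is only a sketch of the calculation on this slide; the function name `propensity` and its parameter names are illustrative.

```python
def propensity(n_x_in_state: int, n_in_state: int, n_x: int, n_total: int) -> float:
    """Propensity of residue type X for a secondary structure state.

    n_x_in_state: residues of type X found in the state (e.g. Ala in helix)
    n_in_state:   all residues found in the state (e.g. all helix residues)
    n_x:          residues of type X in the whole database
    n_total:      all residues in the whole database
    """
    freq_x_in_state = n_x_in_state / n_in_state  # freq(X, state)
    freq_x = n_x / n_total                       # freq(X)
    return freq_x_in_state / freq_x


# Numbers from the example: 2000 residues, 100 Ala, 500 in helix, 50 Ala in helix.
p_alpha_ala = propensity(n_x_in_state=50, n_in_state=500, n_x=100, n_total=2000)
print(f"P_alpha(Ala) = {p_alpha_ala:.1f}")  # prints "P_alpha(Ala) = 2.0"
```

A value above 1 means the residue occurs in helices more often than its overall abundance would suggest; a value below 1 means the opposite.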
Gerard J. Kleywegt’s slide
Secondary structure prediction
• Current machine-learning-based methods employ information from multiple sequence alignments, information theory, and machine learning algorithms such as artificial neural networks and Bayesian networks, or a combination of those.
• E.g., PSIPRED:
• http://bioinf.cs.ucl.ac.uk/psipred/
• Protein Data Bank (PDB) entries are submitted by biologists and biochemists from around the world.
• The data are freely accessible on the Internet via the websites of the PDB's member organizations.
X-ray Crystallography
• According to the Online Dictionary of Crystallography, the term resolution describes the ability to distinguish between neighboring features in an electron density map.
• The R factor is one measure of model quality: the level of agreement between the diffraction data calculated from the model and the observed data. Values range from 0 to about 0.6.
• An R factor above 0.5 indicates a poor-quality model.
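To make the R factor concrete, the sketch below computes it for a handful of reflections. It assumes the conventional definition R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, taken over structure-factor amplitudes; the slide itself does not give a formula, and the amplitude values here are invented purely for illustration.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor from observed and calculated amplitudes.

    Assumes the conventional definition:
        R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|)
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Made-up amplitudes for four reflections (illustration only).
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 25.0]
print(f"R = {r_factor(f_obs, f_calc):.2f}")  # prints "R = 0.10"
```

A perfect model would give R = 0, and the value grows as the calculated amplitudes drift away from the observed ones.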
X-ray Crystallography
Resolution   Evaluation   Interpretation
1.2 Å        Excellent    Backbone and most side chains very clear; some hydrogens may be resolved.
2.5 Å        Good         Backbone and many side chains clear.
3.5 Å        OK           Backbone and bulky side chains clear.
5.0 Å        Poor         Backbone mostly clear; side chains not clear.
Pfam
• Pfam 30.0: 16,306 entries (June 2016).
• Information about protein families, represented by hidden Markov models (HMMs).
• Annotations.
• Links to other databases: RCSB PDB, CATH, SCOP, Proteopedia, etc.
CATH
The domains are classified within the CATH structural hierarchy:
• Class (C) level: classification based on secondary structure content, i.e. all alpha, all beta, a mixture of alpha and beta, or little secondary structure.
• Architecture (A) level: classification based on the arrangement of secondary structures in three-dimensional space.
• Topology/fold (T) level: how the secondary structure elements are connected and arranged.
• Homologous superfamily (H) level: assignments are made if there is good evidence that the domains are related by evolution, i.e. they are homologous.
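CATH identifies each superfamily with a four-part dotted code, one number per level (C.A.T.H), so the hierarchy can be read straight off the code. The sketch below splits such a code into its levels. The names `CathCode` and `parse_cath_code` are hypothetical, the class labels reuse the wording of the list above with the numbering 1 to 4 taken as an assumption, and the code "3.40.50.720" is used only as an example of the format.

```python
from dataclasses import dataclass

# Assumption: class numbers 1-4 correspond to the four classes listed above.
CLASS_LABELS = {
    1: "all alpha",
    2: "all beta",
    3: "mixture of alpha and beta",
    4: "little secondary structure",
}


@dataclass(frozen=True)
class CathCode:
    c: int  # Class
    a: int  # Architecture
    t: int  # Topology/fold
    h: int  # Homologous superfamily

    @property
    def class_label(self) -> str:
        return CLASS_LABELS.get(self.c, "unknown class")

    @property
    def architecture(self) -> str:
        return f"{self.c}.{self.a}"

    @property
    def topology(self) -> str:
        return f"{self.c}.{self.a}.{self.t}"

    @property
    def superfamily(self) -> str:
        return f"{self.c}.{self.a}.{self.t}.{self.h}"


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted C.A.T.H code into its four integer levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    c, a, t, h = (int(p) for p in parts)
    return CathCode(c, a, t, h)


code = parse_cath_code("3.40.50.720")
print(code.class_label)   # mixture of alpha and beta
print(code.architecture)  # 3.40
print(code.topology)      # 3.40.50
```

Two domains that share the first three numbers have the same fold; only when all four numbers match are they placed in the same homologous superfamily.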