GCBA 815: Tools and Algorithms in Bioinformatics
GCBA815, Fall 2013
Week-14: Protein Structure and PTM Analysis Tools
Babu Guda
Department of Genetics, Cell Biology and Anatomy
University of Nebraska Medical Center
12/6/2013

Structural Bioinformatics
Human cancer-related protein (MDM2) with embedded small-molecule drug compounds (“nutlin”). MDM2 is shown as stick figures; “nutlin” is shown as small cyan-colored spheres (van der Waals radii).
Picture taken from BayeNetwork
Binding of Drug compound to a cancer-related protein, MDM2
• The function of a biological macromolecule is highly dependent on its structural conformation
• Deciphering the structure of DNA (double-helix) has revolutionized biological research
• Similarly, enzyme functions are highly specific and are regulated by the proper orientation of their active sites
• While many proteins act as enzymes, there are a number of structural proteins that support cellular and tissue-level infrastructure and aid in intra- and intercellular communication
Examples
• Actin: Supports the size, shape, structure and motion of cells
• Cadherin: Adhesive proteins that glue cells together
• Clathrin: Vesicular trafficking
• Collagen: About 25% of all protein in our body
• Integrins: On the cell surface, linking cells
• Vaults: Symmetrical shells made of vault proteins
• PROSITE provides consensus patterns for a number of PTM sites. However, whether a modification actually occurs depends on the structural or environmental context of the site in the protein fold
• For this reason, methods based on regular expressions (regex) or local alignment produce a large number of false positives
• Almost all PTM site prediction methods use artificial neural networks (ANNs) or hidden Markov models (HMMs)
• General procedure:
• Prepare datasets with experimentally-known PTM sites
• Separate the dataset into training and testing data
• Train a network using training data and test it with the test dataset. This process is iterated until the model is well refined
• A sufficient number of training sequences and good-quality data are important for the success of any neural network method
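The general procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any published predictor: the data are synthetic (generated by a made-up toy rule), the model is a single-layer perceptron standing in for a full neural network, and all function names are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_examples(n, seed=1):
    """Build synthetic (window, label) pairs; NOT real PTM data.

    Toy rule (an assumption for illustration): a 7-residue window centered
    on S is labeled 1 when the residue three positions after S is D or E.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        flank = "".join(rng.choice(AMINO_ACIDS) for _ in range(6))
        window = flank[:3] + "S" + flank[3:]
        examples.append((window, 1 if window[6] in "DE" else 0))
    return examples


def one_hot(window):
    """Encode a peptide window as a flat 0/1 vector (20 slots per residue)."""
    return [1.0 if aa == ref else 0.0 for aa in window for ref in AMINO_ACIDS]


def split_dataset(examples, test_fraction=0.25, seed=0):
    """Shuffle and separate the dataset into training and testing data."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def score(model, window):
    """Weighted sum of the encoded window plus the bias term."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, one_hot(window))) + bias


def predict(model, window):
    """Return 1 (predicted site) or 0 (predicted non-site)."""
    return 1 if score(model, window) > 0 else 0


def train_perceptron(train, max_epochs=100, lr=0.1):
    """Iterate over the training data until an epoch makes no mistakes."""
    weights = [0.0] * len(one_hot(train[0][0]))
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for window, label in train:
            error = label - predict((weights, bias), window)
            if error:
                mistakes += 1
                x = one_hot(window)
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # model fits the training data; stop refining
            break
    return weights, bias


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the known label."""
    hits = sum(predict(model, w) == label for w, label in examples)
    return hits / len(examples)


examples = make_toy_examples(200)
train_set, test_set = split_dataset(examples)
model = train_perceptron(train_set)
print("training accuracy:", accuracy(model, train_set))
print("test accuracy:", accuracy(model, test_set))
```

The held-out test set is never used during training, so its accuracy gives a rough estimate of how the model behaves on unseen sequences. With real data, the size and quality of the training set would matter far more than in this toy setting.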
Prediction of Phosphorylation Sites (NetPhos, http://www.cbs.dtu.dk/services/NetPhos/)
• Protein kinases are a very large family of enzymes that catalyze phosphorylation
• NetPhos produces neural network predictions for serine (S), threonine (T) or tyrosine (Y) phosphorylation sites in eukaryotic proteins; phosphorylation at these sites affects a multitude of cellular signaling processes
• Y-kinase Phosphorylation
• S or T-Phosphorylation in Casein Kinase II
• Since these are very short patterns, the amino acids surrounding a phosphorylated residue are significant in determining whether a particular site can be phosphorylated or not
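Because the surrounding residues carry the signal, predictors typically look at a fixed-length window around each candidate residue. The sketch below shows one way to extract such windows; the function name, the flank size of 4 and the "-" padding character are assumptions made for illustration, not NetPhos settings.

```python
def candidate_windows(sequence, flank=4, targets="STY", pad="-"):
    """Yield (position, residue, window) for every S, T or Y in a sequence.

    Positions are 1-based. Windows near the termini are padded so that
    every window has the same length (2 * flank + 1).
    """
    padded = pad * flank + sequence + pad * flank
    for i, aa in enumerate(sequence):
        if aa in targets:
            yield i + 1, aa, padded[i : i + 2 * flank + 1]


for pos, aa, window in candidate_windows("MASTY", flank=2):
    print(pos, aa, window)
```

Each window could then be encoded numerically and passed to a trained model that decides whether the central residue is likely to be phosphorylated.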
Prediction of Glycosylation Sites (NetNGlyc, NetOGlyc)
• Glycoproteins are synthesized by covalent attachment of oligosaccharides to certain proteins at asparagine residues (N-glycosylation) or at serine or threonine residues (O-glycosylation).
• These are usually exported to extracellular destinations, such as mucin in the alimentary tract or glycoprotein hormones in the anterior pituitary gland.
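To illustrate what a consensus-pattern scan looks like, the sketch below searches a sequence for the commonly cited N-glycosylation sequon, written in PROSITE style as N-{P}-[ST]-{P} (Asn, any residue except Pro, Ser or Thr, any residue except Pro). The pattern is not given in these slides and is brought in here as an assumption; the function name and example sequence are hypothetical. A match only marks a candidate site, since the pattern ignores structural context.

```python
import re

# PROSITE-style N-{P}-[ST]-{P} written as a regular expression.
# The lookahead lets overlapping candidate sites be reported too.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, matched tetrapeptide) pairs."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


# Made-up sequence; the NPT stretch is skipped because Pro follows Asn.
print(find_sequons("MNGSANPTNCTQ"))
```

A scan like this is cheap but over-predicts, which is why context-aware models are layered on top of simple pattern matching.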
• Tyrosine (Y) sulfation is an important post-translational modification for proteins that go through the secretory pathway. It regulates several protein-protein interactions and modulates the binding affinity of TM peptide receptors
• Based on the rules described above, HMMs could be trained to build models for predicting protein sequences with patterns that abide by these rules