Protein Prediction with Neural Networks! Chris Alvino, CS152 Fall ’06, Prof. Keller
Jan 18, 2018
Introduction
Proteins, made from amino acids
Polar forces interact for craaazzzy combinatoric explosion!
Just how crazzzzyyy?
Real Crazy
Using crude workload estimates for a petaflop/second capacity machine leads to an estimate of THREE YEARS to simulate 100 MICROSECONDS of protein folding.
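To put those figures in perspective, here is a rough back-of-the-envelope calculation. It only uses the numbers on the slide (a petaflop/second machine, three years, 100 microseconds); the 365-day year and the variable names are assumptions made for illustration.

```python
# Back-of-the-envelope check of the slide's figures.
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: ignore leap years

machine_flops = 1e15                 # petaflop/second machine (from the slide)
wall_clock_s = 3 * SECONDS_PER_YEAR  # three years of computing (from the slide)
simulated_s = 100e-6                 # 100 microseconds of folding (from the slide)

total_ops = machine_flops * wall_clock_s   # operations spent on the simulation
slowdown = wall_clock_s / simulated_s      # wall-clock seconds per simulated second

print(f"total operations: {total_ops:.2e}")
print(f"slowdown factor:  {slowdown:.2e}")
```

Under these assumptions the machine spends on the order of 10^22 to 10^23 operations, and the simulation runs roughly a trillion times slower than the real folding process.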
Why Neural Nets?
Not so crazy
Relatively accurate results
• 70-80% accurate
Patterns learned can lead to useful biological data
Used to quickly check existing databases
Early Methods: Black Box Approach
Protein Folding Analysis by an Artificial Neural Network Approach
Authors: R. Sacile and C. Ruggiero
Published 1993
Early Methods: Black Box Approach
Standard Back Prop Algorithm
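For readers who have not seen backpropagation before, a minimal sketch of one training loop follows. It is not the code from the 1993 paper: the toy layer sizes, the sigmoid activation, the squared-error loss, the learning rate, and the absence of bias terms are all assumptions chosen to keep the example short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Return hidden and output activations for input vector x."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update on squared error; weights change in place.

    Returns the loss measured before the update.
    """
    hidden, output = forward(x, w_hidden, w_out)
    # Output-layer error terms: (o - t) * o * (1 - o)
    delta_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    # Hidden-layer error terms: push the output deltas back through w_out
    delta_hidden = [
        h * (1 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] -= lr * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))

random.seed(0)
n_in, n_hidden, n_out = 4, 3, 2   # toy sizes, not the network from the paper
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]

x, target = [1.0, 0.0, 0.0, 1.0], [1.0, 0.0]   # one made-up training example
losses = [backprop_step(x, target, w_hidden, w_out) for _ in range(200)]
print(losses[0], losses[-1])
```

Repeating the update on the same example should drive the loss down, which is the behaviour the printed first and last loss values are meant to show.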
Early Methods: Black Box Approach
3 Layers
• Input = Window size = 13 amino acids
• Hidden Layer = 20 neurons
• Output Layer: 3 possible (alpha, beta, coil)
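The layer sizes above can be sketched as a small feed-forward network. The slide gives only the window size (13), the hidden layer size (20), and the three output classes; the one-hot encoding over 20 standard amino acids, the zero padding at the sequence ends, the sigmoid activation, and the random untrained weights below are assumptions for illustration, so the predicted labels are meaningless until the network is trained.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues (assumed alphabet)
WINDOW = 13                            # from the slide
N_HIDDEN = 20                          # from the slide
CLASSES = ["alpha", "beta", "coil"]    # from the slide

def encode_window(sequence, center):
    """One-hot encode the 13 residues centered on `center` (assumed encoding).

    Positions that fall off either end of the sequence stay all-zero.
    """
    half = WINDOW // 2
    vector = []
    for pos in range(center - half, center + half + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            one_hot[AMINO_ACIDS.index(sequence[pos])] = 1.0
        vector.extend(one_hot)
    return vector

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights):
    """Apply one fully connected sigmoid layer (no bias terms in this sketch)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

random.seed(0)
n_inputs = WINDOW * len(AMINO_ACIDS)   # 13 * 20 = 260 under this encoding
w_hidden = [[random.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(N_HIDDEN)]
w_out = [[random.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)] for _ in range(len(CLASSES))]

def predict(sequence, center):
    """Return the class with the largest output activation for one residue."""
    scores = layer(layer(encode_window(sequence, center), w_hidden), w_out)
    return CLASSES[scores.index(max(scores))]

sequence = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence for illustration
labels = [predict(sequence, i) for i in range(len(sequence))]
print(labels)
```

Sliding the window one residue at a time gives one prediction per amino acid, each based on the residue itself and its six neighbours on either side.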
Early Methods: Black Box Approach
7 training sets
• Each consists of around 1500 residues (amino acids)
Training took 3-4 hours
Results
Artificial Neural Networks and Hidden Markov Models for
Predicting the Protein Structures: The Secondary
Structure Prediction in Caspases
Thimmappa S. Anekonda (2002)
Current State of the Art
Neural Networks and Hidden Markov Models
Hidden Markov what?
Hidden Markov models (HMMs), originally developed for other applications such as speech recognition, are generative, probabilistic models of sequential information.
An observed sequence is modeled as being the stochastic result of an underlying unobserved random walk through the hidden states of the model.
The parameters of an HMM are the transition probabilities between the hidden states and the symbol emission probabilities from each hidden state.
State transitions in a hidden Markov model (example)
x: hidden states
y: observable outputs
a: transition probabilities
b: output probabilities
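A toy example may make the notation concrete. The sketch below reuses the names x, y, a, and b from the figure and computes the probability of an observed sequence with the forward algorithm. The two states, two outputs, all probability values, and the initial distribution `pi` are made up for illustration and do not come from the slides.

```python
states = ["x1", "x2"]                              # hidden states
pi = {"x1": 0.6, "x2": 0.4}                        # initial distribution (assumed)
a = {"x1": {"x1": 0.7, "x2": 0.3},                 # transition probabilities
     "x2": {"x1": 0.4, "x2": 0.6}}
b = {"x1": {"y1": 0.9, "y2": 0.1},                 # output (emission) probabilities
     "x2": {"y1": 0.2, "y2": 0.8}}

def sequence_probability(observations):
    """Forward algorithm: P(observations), summed over all hidden paths."""
    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = {s: pi[s] * b[s][observations[0]] for s in states}
    for y in observations[1:]:
        alpha = {
            s: b[s][y] * sum(alpha[prev] * a[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

print(sequence_probability(["y1", "y2", "y1"]))
```

The function never needs to know which hidden path was actually taken; it sums over all of them, which is what makes the hidden states "hidden" in practice.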
Caspases, the friendly Ghost
Caspases are a family of intracellular cysteine endopeptidases.
They play a key role in inflammation and mammalian apoptosis or programmed cell death.
Clash of the Titans
PHDSec
• Utilizes evolutionary information
PSIPRED
• Uses iterated PSI-BLAST profiles as input instead of multiple sequence alignments like PHDSec
SAM-T02
• Uses ANN and HMM
PROF King
• Uses seven GOR-based predictions and ANN