TM PRO & Comparison of Algorithms for “Protein Stability Prediction Upon Mutations”
Madhavi Ganapathiraju, Graduate student, Carnegie Mellon University

Jan 03, 2016

Page 1

TM PRO & Comparison of Algorithms for “Protein Stability Prediction Upon Mutations”

Madhavi Ganapathiraju, Graduate student

Carnegie Mellon University

Page 2

Overview

• TMpro evaluations on PDBTM, TMPDB and MPTOPO are complete
• Additional inputs to TMpro are being studied
  – Yule values (not successful)
  – Evolutionary profile (promising)
• TMpro website has been completed
• Evaluation of algorithms to predict protein stability changes upon mutations

Page 3

Part 1: TMpro

Page 4

TMpro Evaluations

                Segment level                        Residue level
    Method      Qok   F score   Recall   Precision   Q2              Misclassified as soluble

MPtopo (101 TM proteins)
2a  TMHMM       66    91        89       94          84              5
2b  TMpro NN    60    93        92       94          79              0

PDBTM (191 TM proteins)
3a  TMHMM       68    90        89       90          84              13
3b  TMpro NN    57    93        93       93          81              2
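The slide does not define the metrics, so the following is only a sketch under the usual assumptions: the segment F score is taken to be the harmonic mean of segment recall and segment precision, and Q2 is taken to be the percentage of residues whose two-state label is predicted correctly. The function names and the label strings (`M` for membrane, `n` for non-membrane) are illustrative.

```python
def f_score(recall, precision):
    """Harmonic mean of recall and precision (same units as the inputs)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


def q2(observed, predicted):
    """Percentage of residues whose two-state label is predicted correctly."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Segment recall and precision reported for TMpro NN on MPtopo (row 2b).
print(round(f_score(92, 94)))          # 93, matching the reported F score

# Hypothetical 10-residue example in which 8 of 10 labels agree.
print(q2("nnMMMMMnnn", "nnnMMMMMnn"))  # 80.0
```

Under this assumed definition, rows 2a, 2b and 3b reproduce the reported F scores from the rounded recall and precision; row 3a gives about 89.5 against a reported 90, which may simply reflect rounding of the underlying values.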

Page 5

TMpro web-server is fully functional!

Competition for TMpro logo

Prize: See your logo on the web!

Page 6

Attempts to overcome confusion with globular soluble helices (1)

• Yule value features to be added
  – Yule value features that discriminate amino acid neighbor propensities between TM and non-TM helices were computed earlier
  – Tried adding these features as input to the NN predictor, but could not achieve a quantitative improvement
  – I will discuss this in the future when I have results to present

Page 7

Attempts to overcome confusion with globular soluble helices (2)

• Evolutionary profile information
  – It is known that knowledge of the evolutionary profile of a protein can improve prediction accuracy to a great extent
• TMpro is capable of predicting TMs without requiring knowledge of the profile
  – Useful when you cannot extract sequence alignments from known proteins
• But where the profile is known, we would like to use that additional information

Page 8

Profile generation

• Get multiple sequence alignments
• Compute a position-specific scoring matrix for each protein
  – 21 rows (20 amino acids, and 1 row for gaps)
• A profile is generated for each protein in the training and test sets

Those of you who have worked with evolutionary analysis before, please give feedback

PSSM(i,j) = log(C(i,j) / total counts at position j)
            log(C(i,j) / unigram count of i in the protein)
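As a rough illustration of the profile step, the sketch below computes only the first expression, log(C(i,j) / total counts at position j), for a toy alignment. The pseudocount (added so that unseen symbols do not produce log 0), the row order in `ALPHABET`, and the rule that any non-standard character counts as a gap are assumptions, not details from the slides. It also assumes that identity dots in the alignment have already been replaced by the query residue.

```python
import math

# 20 amino acids plus one gap symbol = 21 rows (row order is an assumption).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def column_profile(alignment, pseudocount=1.0):
    """Return a 21 x L matrix of log relative frequencies per alignment column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = [[0.0] * length for _ in ALPHABET]
    for j in range(length):
        counts = {symbol: pseudocount for symbol in ALPHABET}
        for row in alignment:
            symbol = row[j].upper()
            # Assumption: anything outside the 20 standard residues is a gap.
            counts[symbol if symbol in counts else "-"] += 1.0
        total = sum(counts.values())
        for i, symbol in enumerate(ALPHABET):
            profile[i][j] = math.log(counts[symbol] / total)
    return profile


# Hypothetical three-sequence alignment of length 4.
toy_alignment = ["MKV-", "MRV-", "MKIL"]
pssm = column_profile(toy_alignment)
print(len(pssm), len(pssm[0]))  # 21 rows, 4 columns
```

Each column of the result is a 21-value vector, which matches the 21 rows described on the slide.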

Page 9

Doubts

• We have labels for training sequences
  – But when the original sequence has gaps when aligned, how should the labels of the gaps be interpreted?

              --n------n----n------nnn-----n------n-----------------M-----
2a65      369 --D------E----L------KLS-----R------K-----------------H----- 377
2A65_A    369 --.------.----.------...-----.------.-----------------.----- 377
AAC07817  369 --.------.----.------...-----.------.-----------------.----- 377
YP_001956 364 --E------S----F------G.K-----.------.-----------------T----- 372

              -M------M------M------M-------M----------M---------MM-------
2a65      378 -A------V------L------W-------T----------A---------AI------- 385
2A65_A    378 -.------.------.------.-------.----------.---------..------- 385
AAC07817  378 -.------.------.------.-------.----------.---------..------- 385
YP_001956 373 -S------C------.-----------------------------------IL------- 377

Even TM regions have gaps, as shown above

What labels should be assigned to gaps?

Page 10

Doubts

• When nothing is shown (gap/alignment) for some sequences, I am counting those as gaps

XP_659910  47 L-......K.----------...KAP----RSNQV.-..FVAGTMGLASAVGA.AT 86
AAW43619  100 .....A..A-----------KNP----NTTRNV-..FMVGALGALGASSV.ST 136
CAB59195   59 ----.N.RP.-A..VIGSARFAYMAWTRVA 83
XP_466001 107 SKRA.-A.FVLSGGRFIYASLLRLL 130
AAA20832  103 SKRA.-A.FVLTGGRFVYASLVRLL 126

What to do with missing segment information for some sequences?

Page 11

Using profile for prediction
Studied independent of TMpro

Neural network with 21 input, 21 hidden and 1 output neurons

[Figure: predicted output (nonmembrane = 0, membrane = 1) versus residue number, shown with the experimentally observed locations of TM helices]
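To make the 21-21-1 architecture concrete, here is a minimal forward-pass sketch. It assumes sigmoid activations, one 21-value profile column as the input for each residue, and randomly initialised (untrained) weights; none of these details are stated on the slide, and the function names are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in=21, n_hidden=21, seed=0):
    """Small random weights; each weight row carries a trailing bias term."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, column):
    """Map one 21-value profile column to a score in (0, 1)."""
    hidden, output = network
    activations = [
        sigmoid(sum(w * x for w, x in zip(row[:-1], column)) + row[-1])
        for row in hidden
    ]
    return sigmoid(sum(w * a for w, a in zip(output[:-1], activations))
                   + output[-1])


network = init_network()
score = forward(network, [0.0] * 21)  # hypothetical all-zero profile column
print(0.0 < score < 1.0)              # True
```

After training, a score near 1 would be read as membrane and a score near 0 as nonmembrane, in line with the output coding used in the figure.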

Page 12

Another output

Page 13

The NN architecture needs to be modified, but instead I post-processed the neural network output

Computed the wavelet transform: Mexican hat wavelet, scale = 10
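The post-processing step can be sketched as a single-scale wavelet transform of the per-residue network output. The Mexican hat shape and scale = 10 come from the slide; the 1/sqrt(scale) normalisation, the truncation of the wavelet at five scales, and the zero treatment beyond the sequence ends are assumptions made for this example.

```python
import math


def mexican_hat(t):
    """Unnormalised Mexican hat (Ricker) wavelet."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def wavelet_coefficients(signal, scale=10.0):
    """Wavelet transform of `signal` at a single scale, by direct summation."""
    half_width = int(5 * scale)  # the wavelet is negligible beyond ~5 scales
    norm = 1.0 / math.sqrt(scale)
    coefficients = []
    for b in range(len(signal)):
        lo = max(0, b - half_width)
        hi = min(len(signal), b + half_width + 1)
        total = sum(signal[k] * mexican_hat((k - b) / scale)
                    for k in range(lo, hi))
        coefficients.append(norm * total)
    return coefficients


# Hypothetical network output: one 21-residue stretch scored as membrane.
nn_output = [0.0] * 40 + [1.0] * 21 + [0.0] * 40
coeffs = wavelet_coefficients(nn_output, scale=10.0)
print(coeffs.index(max(coeffs)))  # 50, the centre of the stretch
```

At scale 10 the positive lobe of the wavelet spans about 20 residues, roughly the length of a TM helix, so segments of that length give the strongest response while short spikes in the raw output are smoothed away.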

Page 14

Some more wavelet outputs

Note that these are from the training data itself. Yet to check how it performs overall.

Page 15

Part 2: Stability upon Mutations

Page 16

Evaluation of predictions of protein stability changes upon mutations

• Effects of mutations on 2 TM proteins are available in our group
  – The two proteins are rhodopsin and bacteriorhodopsin
  – Data available for how much misfolding occurs
  – How stability of the protein is affected
• There are algorithms that can also predict these changes
• We evaluated how accurate and reliable the prediction methods are by comparing their results with our experimental data

Page 17

3 Prediction algorithms

• I-Mutant 2.0
  – Support vector machine
  – Features: amino acid neighbors in a 0.9 nm (9 Å) sphere, temperature, pH, relative solvent-accessible surface area
  – http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi
• DFIRE
  – Knowledge-based statistical potentials
  – http://phyyz4.med.buffalo.edu/hzhou/mutation.html
• FoldX
  – Statistical mechanics; accounts for various energy terms
  – http://fold-x.embl-heidelberg.de:1100/

Page 18

Authors’ claims in 3 papers

Page 19

Our results

           Number of known mutations   I-Mutant   DFIRE   FoldX
Folding    52                          54.7       57.7    50
Meta 2     32                          78.1       73.3    46.9
Both       84                          64.3       63.0    50.6

           Number of known mutations   I-Mutant   DFIRE   FoldX
Folding    147                         35.4       37.1    55.7
Meta 2     159                         56.0       47.5    67.2
Both       279                         55.3       38.7    52.7

Rhodopsin (PDB: 1U19)

Bacteriorhodopsin (PDB: 1QM8)

Page 20

Bias in # of mutations that increase/decrease stability

Database bias affects the apparent accuracies of algorithms

I-Mutant, for example, predicts a decrease in stability for a majority of the mutations.

Whether the experimentally studied mutations preserve the natural bias toward stability-decreasing mutations affects the apparent accuracy of the prediction algorithms.

                     Experimental   I-Mutant   DFIRE   FoldX
Rhodopsin            63             75         46      66
Bacteriorhodopsin    81             97         81      65
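The effect of database bias can be illustrated with a toy calculation. The labels below are hypothetical and do not come from the rhodopsin or bacteriorhodopsin data: a predictor that always answers "decreases stability" scores exactly the fraction of destabilizing mutations in whatever set it is tested on.

```python
def accuracy(experimental, predicted):
    """Percentage of mutations whose predicted direction matches experiment."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(e == p for e, p in zip(experimental, predicted))
    return 100.0 * matches / len(experimental)


# Hypothetical labels: "-" = mutation decreases stability, "+" = increases it.
biased_set = ["-"] * 8 + ["+"] * 2    # 80% destabilizing
balanced_set = ["-"] * 5 + ["+"] * 5  # 50% destabilizing

always_decrease = ["-"] * 10          # a predictor that ignores its input

print(accuracy(biased_set, always_decrease))    # 80.0
print(accuracy(balanced_set, always_decrease))  # 50.0
```

The same uninformative predictor looks 30 points better on the biased set, which is why the composition of the experimental data has to be considered alongside any accuracy figure.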

Page 21

Correlation with known data

                     I-Mutant   DFIRE   FoldX
Rhodopsin            0.11       0.16    0.24
Bacteriorhodopsin    -0.09      0.18    -0.18

The correlations reported for these methods are high (> 0.7)

On the data compared here, the correlations are low (at most 0.24 in absolute value)
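The slide does not say which correlation measure was used. Assuming it is the Pearson correlation between predicted and experimentally measured stability changes, it could be computed as in this sketch; the input values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical experimental and predicted stability changes for four mutations.
experimental = [-1.2, 0.4, -0.8, -2.0]
predicted = [-0.5, -0.9, -1.1, -0.3]
print(round(pearson(experimental, predicted), 2))
```

A value near 1 means the predictions rank and scale the mutations much as the experiments do, a value near 0 means little linear relationship, and a negative value means the predicted trend is reversed.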

Page 22

Notes

• Local installations of BLAST and netblast are on cologne:
  – /usr1/blast-2.2.13/
  – /usr1/netblast-2.2.13/
• Java SDK on cologne
  – /usr1/j2sdk1.4.2_11/

Page 23

Acknowledgements

Judith Klein-Seetharaman

Christopher Jon Jursa, Pitt Information Sciences (for developing the web interface)