An Introduction to Bioinformatics Protein Structure Prediction

Feb 02, 2016

duaa

Page 1: An Introduction to Bioinformatics

An Introduction to Bioinformatics

Protein Structure Prediction

Page 2: An Introduction to Bioinformatics

Aims

• Understand the use of algorithms

• Recognize different approaches

• Understand the limitations

Objectives

• Predict occurrence of aspects of structure

• Select appropriate tools

Page 3: An Introduction to Bioinformatics

Introduction

• Structure has several levels
– 1 primary
– 2 secondary
– 3 tertiary
– 4 quaternary

Page 4: An Introduction to Bioinformatics

1 primary

• Amino acid sequence

NH2-MRLSWYDPDFQARLTRSNSKCQGQLEVYLKDGWHMVCSQSWGRSSKQWEDPSQASKVCQRLNCGVPLSLGPFLVTYTPQSSIICYGQLGSFSNCSHSRNDMCHSLGLTCLE-COOH

Page 5: An Introduction to Bioinformatics

2 secondary

• Localized organisation: α-helices and β-sheets

Page 6: An Introduction to Bioinformatics

3 tertiary

Three-dimensional organisation

Page 7: An Introduction to Bioinformatics

4 quaternary

Multi-protein assembly

Page 8: An Introduction to Bioinformatics

The problem…..

• The best way to determine a structure is experimentally, by X-ray crystallography, NMR etc.

• Structure databases only hold about 10,000+ structures

• Therefore programs are devised to deduce structure from sequence

• Complex!

Page 9: An Introduction to Bioinformatics

Secondary Structure prediction

• Signal peptides

• Intracellular targeting

• Transmembrane α-helices

• α-helices and β-sheets

• Super-secondary structure (motifs)

Page 10: An Introduction to Bioinformatics

Signal peptides

• Short N-terminal amino acid sequences

• Direct to membrane

• Cleaved after translocation

• Predicted with SignalP

• Günter Blobel received the 1999 Nobel Prize for discovering that proteins carry intrinsic targeting signals

Page 11: An Introduction to Bioinformatics

SignalP predicts signal peptide cleavage sites

Only the first 50-70 residues are analysed

Using neural networks

Page 12: An Introduction to Bioinformatics

Is the sequence a signal peptide?

# Measure  Position  Value  Cutoff  Conclusion
  max. C   25        0.910  0.37    YES
  max. Y   25        0.861  0.34    YES
  max. S   12        0.960  0.88    YES
  mean S   1-24      0.892  0.48    YES
# Most likely cleavage site between pos. 24 and 25: SRA-LE
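The Conclusion column above appears to follow from comparing each value with its cutoff. A minimal sketch of that comparison, using the numbers from the output shown; the helper names are hypothetical, and requiring every measure to pass is an assumption of this sketch rather than SignalP's documented decision rule:

```python
# Rows copied from the SignalP output above: (measure, position, value, cutoff).
rows = [
    ("max. C", "25", 0.910, 0.37),
    ("max. Y", "25", 0.861, 0.34),
    ("max. S", "12", 0.960, 0.88),
    ("mean S", "1-24", 0.892, 0.48),
]


def conclusions(rows):
    """Return {measure: 'YES' or 'NO'}; YES when the value exceeds its cutoff."""
    return {
        measure: ("YES" if value > cutoff else "NO")
        for measure, _position, value, cutoff in rows
    }


def is_signal_peptide(rows):
    """Assumption of this sketch: call a signal peptide only if every measure passes."""
    return all(answer == "YES" for answer in conclusions(rows).values())


if __name__ == "__main__":
    for measure, answer in conclusions(rows).items():
        print(measure, answer)
    print("Signal peptide?", is_signal_peptide(rows))
```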

Page 13: An Introduction to Bioinformatics

Intracellular targeting

• TargetP

• Predicts subcellular location of eukaryotic proteins

• Presequences
– Chloroplasts
– Mitochondria
– signal peptide

Page 14: An Introduction to Bioinformatics

Transmembrane Domains

• Lots of programs

• TMHMM predicts transmembrane α-helices
– hydrophobic
– helix topology
– R or K +ve charge on cytoplasmic side
– Hidden Markov Modelling
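The hydrophobicity criterion can be illustrated with a much simpler technique than TMHMM's hidden Markov model: a sliding-window average over the Kyte-Doolittle hydropathy scale. This is a rough sketch, not TMHMM itself; the function names are hypothetical, and the window of 19 residues and threshold of 1.6 are conventional choices assumed here.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Average hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """0-based start positions of windows hydrophobic enough to span a membrane."""
    profile = hydropathy_profile(seq, window)
    return [i for i, average in enumerate(profile) if average > threshold]


if __name__ == "__main__":
    toy = "D" * 10 + "L" * 19 + "K" * 10  # made-up sequence, not a real protein
    print(candidate_tm_windows(toy))
```

A window method like this only flags hydrophobic stretches; it says nothing about topology, which is where the positive-charge (R or K) rule and the HMM come in.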

Page 15: An Introduction to Bioinformatics

Paste the sequence in FASTA format

e.g. serotonin receptor
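FASTA format is a ">" header line followed by the sequence on one or more lines. A small sketch of producing it from a raw sequence; the function name, the header text and the line width of 60 are illustrative assumptions:

```python
def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width sequence lines."""
    cleaned = "".join(sequence.split()).upper()  # drop spaces and line breaks
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical header and a short made-up fragment, for illustration only.
    print(to_fasta("example_receptor", "mdvlspgqgn nttsppapfe tggnttgisd", width=20))
```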

Page 16: An Introduction to Bioinformatics

Predicts the transmembrane domains and orientation

Page 17: An Introduction to Bioinformatics

α-helices and β-sheets

• GOR algorithm

• Assigns each residue to one conformational state: α-helix, extended chain, reverse turn or coil

• 64.4% accurate

• Many other sites
– most use multiple alignments
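An accuracy figure such as the one quoted above is usually a per-residue measure: the percentage of residues whose predicted state matches the experimentally observed one. A minimal sketch, assuming both are given as equal-length strings of state letters (the function name is hypothetical):

```python
def per_residue_accuracy(predicted, observed):
    """Percentage of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not predicted:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


if __name__ == "__main__":
    # Toy strings: three of the four positions agree.
    print(per_residue_accuracy("hhhc", "hhcc"))  # 75.0
```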

Page 18: An Introduction to Bioinformatics

α-helices and β-sheets

        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MKFSWRTALLWSLPLLVVGFFFWQGSFGGADANLGSNTANTRMTYGRFLEYVDAGRITSVDLYENGRTAI
cccceeeeeecccceeeeeeeeccccccccccccccccccchhhhcceeeeccccceeeeeeccccceee
VQVSDPEVDRTLRSRVDLPTNAPELIARLRDSNIRLDSHPVRNNGMVWGFVGNLIFPVLLIASLFFLFRR
eeccccccchhhhccccccccchhhhhhhhhccccccccceecccceeeeecccccchhhhhhhhheeec
SSNMPGGPGQAMNFGKSKARFQMDAKTGVMFDDVAGIDEAKEELQEVVTFLKQPERFTAVGAKIPKGVLL
cccccccccchhhhcchhhhhhhhccceeeecchhhhhhhhhhhhhhhhhhcccchhhhhcccccceeee
VGPPGTGKTLLAKAIAGEAGVPFFSISGSEFVEMFVGVGASRVRDLFKKAKENAPCLIFIDEIDAVGRQR
ecccccchhhhhhhhhcccccceeecccccceeeeeecccchhhhhhhhhcccccceeeecchhhhcccc
GAGIGGGNDEREQTLNQLLTEMDGFEGNTGIIIIAATNRPDVLDSALMRPGRFDRQVMVDAPDYSGRKEI
ccccccccchhhhhhhhhhhhhcccccccceeeeeeccccchhhhhhccccccceeeeecccccccchhh
LEVHARNKKLAPEVSIDSIARRTPGFSGADLANLLNEAAILTARRRKSAITLLEIDDAVDRVVAGMEGTP
hhhhhhhhccccccchhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeecccccc
LVDSKSKRLIAYHEVGHAIVGTLLKDHDPVQKVTLIPRGQAQGLTWFTPNEEQGLTTKAQLMARIAGAMG
cccccccchhhhhcccceeeeeecccccccceeeecccccccceeccccccccchhhhhhhhhhhhhhhh
GRAAEEEVFGDDEVTTGAGGDLQQVTEMARQMVTRFGMSNLGPISLESSGGEVFLGGGLMNRSEYSEEVA
hhhhhhhcccccceeeccccchhhhhhhhhhhhhhhccccccccccccccceeeecccccccccchhhhh
TRIDAQVRQLAEQGHQMARKIVQEQREVVDRLVDLLIEKETIDGEEFRQIVAEYAEVPVKEQLIPQL
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhcccccccccccc
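A per-residue state string like the one in this output is easier to read once collapsed into segments. The sketch below assumes the lowercase letters are one state per residue (h, e and c presumably standing for helix, extended strand and coil); the function name is hypothetical:

```python
from itertools import groupby


def segments(states):
    """Collapse a per-residue state string into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the ruler in the output.
    """
    runs = []
    position = 1
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, position, position + length - 1))
        position += length
    return runs


if __name__ == "__main__":
    # First 14 predicted states from the output above.
    for state, start, end in segments("cccceeeeeecccc"):
        print(state, start, end)
```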

Page 19: An Introduction to Bioinformatics

Super-secondary Structure

• Secondary structure elements combined into specific geometric arrangements known as motifs

Beta corner

Page 20: An Introduction to Bioinformatics

Super-secondary Structure

Several programs/websites for specific domains e.g.

• PAIRCOIL and MULTICOIL - detect coiled-coil regions
– regions separating domains

• TRESPASSER - detects Leucine Zippers
– Leu-X6-Leu-X6-Leu-X6-Leu protein interaction domain

• NPS@nalysis - detects Helix-Turn-Helix motifs
– Protein interaction/DNA binding
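The Leu-X6-Leu-X6-Leu-X6-Leu pattern translates directly into a regular expression. This is only a naive pattern scan, not how TRESPASSER or any other dedicated tool scores candidates, and the function name is hypothetical. A lookahead is used so that overlapping matches are all reported:

```python
import re

# L, any six residues, L, any six, L, any six, L (22 residues in total).
# Wrapped in a lookahead so overlapping occurrences are not skipped.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(seq):
    """Return (1-based start, matched 22-residue segment) for every pattern hit."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MK" + "LAAAAAA" * 3 + "L" + "GG"  # made-up sequence
    print(find_leucine_zippers(toy))
```

A pattern hit alone is weak evidence, since the heptad spacing occurs by chance in many sequences.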

Page 21: An Introduction to Bioinformatics

Integrated structure prediction

• One stop shop!

• PredictProtein at EBI
– secondary structure
– solvent accessibility
– globular regions
– transmembrane helices
– coiled-coil regions
– a multiple sequence alignment
– ProSite sequence motifs
– low-complexity regions
– ProDom domain assignments

Page 22: An Introduction to Bioinformatics

Tertiary Structure Prediction

• Homology modelling

• Fold recognition

• Threading

• Model building

Page 23: An Introduction to Bioinformatics

Protein sequence (primary structure)

→ Database searching for homologues

– Homologue of known structure → Comparative modelling → 3D-structure

– No homologue of known structure → Fold prediction, ab initio methods etc. → 3D-structure

Page 24: An Introduction to Bioinformatics

Homology Modelling

• Method of choice following BLAST search

• SWISS-MODEL is a good WWW interface

URL: http://www.expasy.ch/swissmod/SWISS-MODEL.html

Page 25: An Introduction to Bioinformatics

Homology Modelling

• Requires at least one sequence of known 3D-structure with significant similarity to the target sequence.

• Compare the target sequence with the structure database using FastA and BLAST.

• Sequences with a FastA score 10.0 standard deviations above the mean of the random scores, or a BLAST P(N) lower than 10⁻⁵, are considered for model building.

• Restrict these to templates that share at least 30% residue identity with the target.
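The template selection criteria can be sketched as a simple filter. Only the BLAST P(N) and the 30% identity cut-offs are implemented here; the FastA score criterion is left out. The `Hit` record, the function names and the definition of identity (identical residues over aligned, non-gap columns) are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one database hit of known 3D-structure."""
    template_id: str
    p_value: float    # BLAST P(N)
    identity: float   # percent residue identity with the target


def percent_identity(aligned_target, aligned_template):
    """Percent identical residues over aligned columns; '-' marks a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_target, aligned_template)
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def select_templates(hits, max_p=1e-5, min_identity=30.0):
    """Keep hits passing both cut-offs, best identity first."""
    kept = [h for h in hits if h.p_value < max_p and h.identity >= min_identity]
    return sorted(kept, key=lambda h: h.identity, reverse=True)


if __name__ == "__main__":
    hits = [Hit("t1", 1e-20, 45.0), Hit("t2", 1e-3, 60.0), Hit("t3", 1e-8, 25.0)]
    print([h.template_id for h in select_templates(hits)])
```

An empty result corresponds to the "no homologue of known structure" case, where fold recognition or threading is tried instead.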

Page 26: An Introduction to Bioinformatics

Homology Modelling

• Framework construction
– compare atom positions - Cα atoms

• Build non-conserved loops

• Complete backbone - add other atoms

• Add side chains

• Refine

Page 27: An Introduction to Bioinformatics

Insulin-like gene from C. elegans
Red = Insulin
Blue = ILGF1

Page 28: An Introduction to Bioinformatics

What if I have no homologue?

Ab initio methods - Threading

• Sequence of unknown structure

• Thread it through a sequence of known structure

• Move the query sequence through residue by residue and compare computationally

– include thermodynamic criteria, solvent accessibility, secondary structure information

• Computationally intensive
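The idea of moving a query along a known structure and scoring each placement can be shown with a toy example. Real threading programs use far richer energy functions and allow gaps; here the template is reduced to a hypothetical string of environments ("b" buried, "e" exposed), the only criterion is that hydrophobic residues prefer buried positions, and all names and scores are assumptions of this sketch:

```python
HYDROPHOBIC = set("AVLIMFWC")  # simplified residue classification


def position_score(residue, environment):
    """+1 if the residue suits the environment, -1 otherwise."""
    buried = environment == "b"
    hydrophobic = residue in HYDROPHOBIC
    return 1 if buried == hydrophobic else -1


def thread(query, template_env):
    """Slide the query along the template without gaps.

    Returns (best 0-based offset, best score).
    """
    if len(query) > len(template_env):
        raise ValueError("query is longer than the template")
    best = None
    for offset in range(len(template_env) - len(query) + 1):
        score = sum(
            position_score(residue, template_env[offset + i])
            for i, residue in enumerate(query)
        )
        if best is None or score > best[1]:
            best = (offset, score)
    return best


if __name__ == "__main__":
    # Two hydrophobic then two charged residues fit best at "bbee".
    print(thread("LLKK", "eebbee"))  # (2, 4)
```

Even this toy version evaluates every placement against every template position, which hints at why the full method is computationally expensive.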

Page 29: An Introduction to Bioinformatics

http://www.cs.bgu.ac.il/~bioinbgu/form.html