Bioinformatics Algorithms: Protein Structure © Jeff Parker, 2009 I don't want to play golf. When I hit a ball, I want someone else to go chase it. - Rogers Hornsby

Dec 29, 2015


Hilda Ray
Transcript
Page 1:

Bioinformatics Algorithms: Protein Structure

© Jeff Parker, 2009

I don't want to play golf. When I hit a ball, I want someone else to go chase it. - Rogers Hornsby

Page 2:

Outline

Rich topic – I can only hope to hit some highlights

Protein structures

Protein Folding

Techniques

Chou-Fasman

HP Lattices

Geometric Hashing

Patchdock

Page 3:

Resources

www.youtube.com/watch?v=swEc_sUVz5I

There are many nearby videos

www.learner.org/courses/biology/units/proteo/images.html

Includes images and four short animated videos

www.chembio.uoguelph.ca/educmat/phy456/456lec01.htm

Nice overview of folding. Simple animation

webhost.bridgew.edu/fgorga/proteins/

Tutorial with Jmol models, including alpha helix and beta sheet

Page 4:

Sources

Polymer principles and protein folding, Ken Dill

Center on Polymer Interfaces and Macromolecular Assemblies, Stanford U.

Lecture notes from Walter Chazin, Vanderbilt

Folding@home

Geometric Hashing: an Overview, H. J. Wolfson, Isidore Rigoutsos

Page 5:

Amino Acid Chains

Difference is in side chain

Those with mostly Carbon and Hydrogen are hydrophobic

Polar side chains often have Oxygen and Nitrogen

Third class are those side chains that are charged at normal pH.

[Figure: an amino acid, with a central carbon bonded to NH2, COOH, H, and a side chain R]

20 different types of side chain

Page 6:

Anfinsen

The Central dogma says: Sequence specifies structure

C. B. Anfinsen worked on ribonuclease, which degrades RNA into smaller components

He observed that denatured (unfolded) ribonuclease would no longer function correctly, but would refold when allowed

Denature – to “unfold” a protein back to random coil configuration

β-mercaptoethanol – breaks disulfide bonds

Urea or guanidine hydrochloride – denaturant

Also heat or pH

Anfinsen’s experiments

Denatured ribonuclease with urea, then removed urea

Ribonuclease spontaneously regained enzymatic activity

Evidence that it re-folded to native conformation

Page 7:

Protein Folding

The structure that a protein adopts is vital to its chemistry

Its structure determines which of its amino acids are exposed to carry out the protein’s function

Its structure determines what substrates it can react with

Page 8:

Blind Watchmaker's Paradox

The space of all possible sequences is enormous

The chance that a useful protein, such as insulin, could have been built by chance is minuscule

Thus Life did not arise by chance

Page 9:

Blind Watchmaker's Paradox

The space of all possible sequences is enormous

The chance that a useful protein, such as insulin, could have been built by chance is minuscule

Thus Life did not arise by chance

The chance of hitting the precise sequence for insulin is indeed small

However, there are many alternatives that would function as well
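To put a rough number on "enormous", the sketch below counts the sequences of a given length over the 20 amino acids. The length of 51 residues is an assumption that is not stated on the slide: it is the combined length of the A chain (21) and B chain (30) of mature human insulin.

```python
import math

AMINO_ACIDS = 20      # number of standard amino acids
INSULIN_LENGTH = 51   # assumption: A chain (21) + B chain (30) residues

def sequence_space(length, alphabet=AMINO_ACIDS):
    """Number of distinct sequences of the given length."""
    return alphabet ** length

count = sequence_space(INSULIN_LENGTH)
print(f"roughly 10^{math.log10(count):.0f} possible sequences")
```

The count says nothing about how many of those sequences would work as well as insulin, which is the point of the rebuttal on this slide.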

Page 10:

Determining Protein Structure

There are on the order of 100,000 distinct proteins in the human proteome.

3D structures have been determined for 14,000 proteins, from all organisms

Includes duplicates with different ligands bound, etc.

Coordinates are determined by X-ray crystallography

Page 11:

X-Ray Crystallography

~0.5mm

• The crystal is a mosaic of millions of copies of the protein.

• As much as 70% is solvent (water)!

• May take months (and a “green” thumb) to grow.

Page 12:

X-Ray diffraction

Image is averaged over:

Space (many copies)

Time (of the diffraction experiment)

Page 13:

pdb

Page 14:

PDB

HEADER HORMONE 08-OCT-96 2HIU

TITLE NMR STRUCTURE OF HUMAN INSULIN IN 20% ACETIC ACID, ZINC-

TITLE 2 FREE, 10 STRUCTURES

COMPND MOL_ID: 1;

COMPND 2 MOLECULE: INSULIN;

COMPND 3 CHAIN: A;

COMPND 4 MOL_ID: 2;

COMPND 5 MOLECULE: INSULIN;

COMPND 6 CHAIN: B

SOURCE MOL_ID: 1;

SOURCE 2 ORGANISM_SCIENTIFIC: HOMO SAPIENS;

SOURCE 3 ORGANISM_COMMON: HUMAN;

SOURCE 4 ORGANISM_TAXID: 9606;

SOURCE 5 MOL_ID: 2;

Page 15:

PDB

ATOM 1 N GLY A 1 -6.132 6.735 1.016 1.00 0.00 N

ATOM 2 CA GLY A 1 -4.686 6.753 1.376 1.00 0.00 C

ATOM 3 C GLY A 1 -3.864 6.149 0.235 1.00 0.00 C

ATOM 4 O GLY A 1 -3.324 6.855 -0.593 1.00 0.00 O

ATOM 5 H1 GLY A 1 -6.407 5.776 0.726 1.00 0.00 H

ATOM 6 H2 GLY A 1 -6.697 7.020 1.840 1.00 0.00 H

ATOM 7 H3 GLY A 1 -6.302 7.398 0.232 1.00 0.00 H

ATOM 8 HA2 GLY A 1 -4.370 7.772 1.548 1.00 0.00 H

ATOM 9 HA3 GLY A 1 -4.531 6.170 2.272 1.00 0.00 H

ATOM 10 N ILE A 2 -3.761 4.849 0.186 1.00 0.00 N
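As a minimal sketch of how such records can be read, the code below parses ATOM lines exactly as they are printed above. It splits on whitespace, which is an assumption that holds for these lines only: real PDB files use fixed column positions, and neighbouring fields can run together. The `Atom` type and `parse_atom` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int    # atom serial number
    name: str      # atom name, e.g. "CA" for the alpha carbon
    residue: str   # three-letter residue name
    chain: str     # chain identifier
    res_seq: int   # residue sequence number
    x: float       # coordinates in Angstroms
    y: float
    z: float

def parse_atom(line):
    """Parse one whitespace-separated ATOM record as printed on the slide."""
    fields = line.split()
    if fields[0] != "ATOM":
        raise ValueError("not an ATOM record")
    return Atom(int(fields[1]), fields[2], fields[3], fields[4],
                int(fields[5]), float(fields[6]), float(fields[7]),
                float(fields[8]))

records = [
    "ATOM 1 N GLY A 1 -6.132 6.735 1.016 1.00 0.00 N",
    "ATOM 2 CA GLY A 1 -4.686 6.753 1.376 1.00 0.00 C",
    "ATOM 10 N ILE A 2 -3.761 4.849 0.186 1.00 0.00 N",
]
atoms = [parse_atom(r) for r in records]
print(atoms[1])
```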

Page 16:

Protein Folding

Proteins fold into the low energy conformation.

Proteins begin folding during translation.

Hydrophobic residues are buried in an interior core to form an α helix.

Alpha helices are found in sequences with Ala, Leu, Met, Phe, Glu, Gln, Lys, Arg, His

Another common form is β sheets.

Beta sheets are found in sequences rich in Tyr, Trp, Ile, Val, Thr, Cys

Molecular chaperones work to fold new proteins.

Page 17:

Alpha helix

Page 18:

Beta Sheets

Page 19:

Protein Structures

Primary Structure

The order of amino acids

Secondary Structure

Local shape – alpha helix and beta sheets

Tertiary Structure

Fully Folded Shape

Quaternary Structure

Combination of multiple components

Page 20:

Structure Prediction

Given a new protein, how can we find the structure? Three major methods are used

Comparative modeling – look for a homologue

Fold recognition

Look for regions characteristic of folds

Ab initio

Simulate the attractions between parts of the peptide

Difficult – high dimension, and hard to get accurate models of all the forces

Page 21:

Protein Structure in 3 steps.

Amino-acid #1 Amino-acid #2

Peptide bond

Step 1. Two amino-acids together (di-peptide)

Page 22:

Step 2: Most flexible degrees of freedom:

Protein Structure in 3 steps.

Page 23:

PDB

REMARK 500 SUBTOPIC: TORSION ANGLES

REMARK 500

REMARK 500 TORSION ANGLES OUTSIDE THE EXPECTED RAMACHANDRAN REGIONS:

REMARK 500 (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN IDENTIFIER;

REMARK 500 SSEQ=SEQUENCE NUMBER; I=INSERTION CODE).

REMARK 500

REMARK 500 STANDARD TABLE:

REMARK 500 FORMAT:(10X,I3,1X,A3,1X,A1,I4,A1,4X,F7.2,3X,F7.2)

REMARK 500

REMARK 500 EXPECTED VALUES: GJ KLEYWEGT AND TA JONES (1996). PHI/PSI-

REMARK 500 CHOLOGY: RAMACHANDRAN REVISITED. STRUCTURE 4, 1395 - 1400

REMARK 500

REMARK 500 M RES CSSEQI PSI PHI

REMARK 500 1 SER A 9 -156.44 167.40

REMARK 500 1 CYS A 20 -112.62 -53.53

REMARK 500 1 CYS B 7 127.43 -19.39

Page 24:

Protein Structures

Page 25:

Not all pairs of angles possible

Some configurations lead to self intersections

Studied by Ramachandran
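The angles in question are backbone torsion (dihedral) angles. As a sketch, the function below computes the dihedral angle defined by four points using plain vector arithmetic. For the backbone, phi is conventionally taken over the atoms C(i-1), N(i), CA(i), C(i) and psi over N(i), CA(i), C(i), N(i+1). The helper names are hypothetical, and the sign convention shown is an assumption that should be checked against whatever tool the results are compared with.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of the outer bonds perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1) * c for c in b1))
    w = sub(b2, tuple(dot(b2, b1) * c for c in b1))
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Four points with the outer bonds at right angles about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying this to every residue of a structure and plotting phi against psi gives a Ramachandran plot.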

Page 26:

Insulin Ramachandran plot

http://www.fos.su.se/~pdbdna/input_Raman.html

Page 27:

Secondary Structure Prediction

Easier than folding

Current algorithms can predict secondary structure with 70-80% accuracy

Chou, P.Y. & Fasman, G.D. (1974). Biochemistry, 13, 211-222.

Based on frequencies of occurrence of residues in helices and sheets

Count how many times amino acid has been observed in

alpha helix, beta sheet, or in turn (a, b, c)

How many times it has been seen in first, second, third and fourth position in a turn f(i), f(i+1)…

Build a table of probabilities…

Page 28:

Chou-Fasman Parameters

Name            P(a)  P(b)  P(turn)  f(i)   f(i+1)  f(i+2)  f(i+3)
Alanine          142    83     66    0.060  0.076   0.035   0.058
Arginine          98    93     95    0.070  0.106   0.099   0.085
Aspartic Acid    101    54    146    0.147  0.110   0.179   0.081
Asparagine        67    89    156    0.161  0.083   0.191   0.091
Cysteine          70   119    119    0.149  0.050   0.117   0.128
Glutamic Acid    151    37     74    0.056  0.060   0.077   0.064
Glutamine        111   110     98    0.074  0.098   0.037   0.098
Glycine           57    75    156    0.102  0.085   0.190   0.152
Histidine        100    87     95    0.140  0.047   0.093   0.054
Isoleucine       108   160     47    0.043  0.034   0.013   0.056
Leucine          121   130     59    0.061  0.025   0.036   0.070
Lysine           114    74    101    0.055  0.115   0.072   0.095
Methionine       145   105     60    0.068  0.082   0.014   0.055
Phenylalanine    113   138     60    0.059  0.041   0.065   0.065
Proline           57    55    152    0.102  0.301   0.034   0.068
Serine            77    75    143    0.120  0.139   0.125   0.106
Threonine         83   119     96    0.086  0.108   0.065   0.079
Tryptophan       108   137     96    0.077  0.013   0.064   0.167
Tyrosine          69   147    114    0.082  0.065   0.114   0.125
Valine           106   170     50    0.062  0.048   0.028   0.053

Page 29:

Chou-Fasman Algorithm

Identify α-helices

4 out of 6 contiguous amino acids that have P(a) > 100

Extend the region until 4 amino acids with P(a) < 100 found

Compute P(a) and P(b); if the region is >5 residues and P(a) > P(b), identify as a helix

Repeat for β-sheets [use P(b)]

If an α and a β region overlap, the overlapping region is predicted according to P(a) and P(b)
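The helix steps can be sketched as follows, using the P(a) and P(b) columns of the parameter table keyed by one-letter codes. This is one reading of the rules, not a reference implementation: the function names are hypothetical, the extension step is read as "stop in front of the first run of four residues that all have P(a) < 100", only rightward extension is shown, and P(a) and P(b) for a region are taken as sums over the region, as in the worked example in these slides.

```python
# residue: (P(a), P(b)), copied from the Chou-Fasman parameter table
PARAMS = {
    "A": (142, 83), "R": (98, 93), "D": (101, 54), "N": (67, 89),
    "C": (70, 119), "E": (151, 37), "Q": (111, 110), "G": (57, 75),
    "H": (100, 87), "I": (108, 160), "L": (121, 130), "K": (114, 74),
    "M": (145, 105), "F": (113, 138), "P": (57, 55), "S": (77, 75),
    "T": (83, 119), "W": (108, 137), "Y": (69, 147), "V": (106, 170),
}

def find_nucleus(seq, window=6, needed=4):
    """First window of 6 residues in which at least 4 have P(a) > 100."""
    for start in range(len(seq) - window + 1):
        hits = sum(1 for aa in seq[start:start + window] if PARAMS[aa][0] > 100)
        if hits >= needed:
            return start, start + window
    return None

def extend_right(seq, end, run=4):
    """Grow the region until the next 4 residues all have P(a) < 100."""
    while end < len(seq):
        nxt = seq[end:end + run]
        if len(nxt) == run and all(PARAMS[aa][0] < 100 for aa in nxt):
            break
        end += 1
    return end

def predict_first_helix(seq):
    nucleus = find_nucleus(seq)
    if nucleus is None:
        return None
    start, end = nucleus
    end = extend_right(seq, end)
    region = seq[start:end]
    p_a = sum(PARAMS[aa][0] for aa in region)
    p_b = sum(PARAMS[aa][1] for aa in region)
    return region, p_a, p_b, len(region) > 5 and p_a > p_b

print(predict_first_helix("CAENKLDHVRGPTCILFMTWYNDGP"))
```

On the example sequence used in these slides this yields the region CAENKLDHV with P(a) = 972 and P(b) = 843.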

Page 30:

Chou-Fasman, cont’d

Identify hairpin turns:

P(t) = f(i) of the residue × f(i+1) of the next residue × f(i+2) of the following residue × f(i+3) of the residue at position (i+3)

Predict a hairpin turn starting at positions where:

P(t) > 0.000075

The average P(turn) for the four residues > 100

P(a) < P(turn) > P(b) for the four residues

Accuracy 60-65%

Page 31:

Chou-Fasman Example

CAENKLDHVRGPTCILFMTWYNDGP

CAENKL – Potential helix (!C and !N)

Residues with P(a) < 100: RNCGPSTY

Extend: When we reach RGPT, we must stop

CAENKLDHV: P(a) = 972, P(b) = 843

Declare alpha helix

Identifying a hairpin turn

VRGP: P(t) = 0.000085

Average P(turn) = 113.25

Avg P(a) = 79.5, Avg P(b) = 98.25
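The turn numbers for VRGP can be checked with a short sketch. Only the four rows of the parameter table that the tetrapeptide needs are copied here, and the function names are hypothetical.

```python
# residue: (P(a), P(b), P(turn), (f(i), f(i+1), f(i+2), f(i+3)))
TURN = {
    "V": (106, 170, 50, (0.062, 0.048, 0.028, 0.053)),
    "R": (98, 93, 95, (0.070, 0.106, 0.099, 0.085)),
    "G": (57, 75, 156, (0.102, 0.085, 0.190, 0.152)),
    "P": (57, 55, 152, (0.102, 0.301, 0.034, 0.068)),
}

def turn_stats(tetra):
    """Return P(t) and the average P(a), P(b), P(turn) of four residues."""
    p_t = 1.0
    for position, aa in enumerate(tetra):
        p_t *= TURN[aa][3][position]   # f(i), f(i+1), f(i+2), f(i+3) in turn
    def average(column):
        return sum(TURN[aa][column] for aa in tetra) / len(tetra)
    return p_t, average(0), average(1), average(2)

def is_turn(tetra):
    """Apply the three hairpin-turn conditions."""
    p_t, avg_a, avg_b, avg_turn = turn_stats(tetra)
    return p_t > 0.000075 and avg_turn > 100 and avg_a < avg_turn > avg_b

print(turn_stats("VRGP"), is_turn("VRGP"))
```

All three conditions hold for VRGP, so a hairpin turn is predicted starting at V.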

Page 32:

Levinthal's Paradox

Consider a 100 residue protein. If each residue can take only 3 positions, there are $3^{100} \approx 5 \times 10^{47}$ possible conformations.

If it takes $10^{-13}$ s to convert from 1 structure to another, exhaustive search would take $1.6 \times 10^{27}$ years!

Folding must proceed by progressive stabilization of intermediates

How can we find this path?

Page 33:

Levinthal's Paradox

Consider a 100 residue protein. If each residue can take only 3 positions, there are $3^{100} \approx 5 \times 10^{47}$ possible conformations.

If it takes $10^{-13}$ s to convert from 1 structure to another, exhaustive search would take $1.6 \times 10^{27}$ years!

Folding must proceed by progressive stabilization of intermediates

How can we find this path?

May not be a single path: may be an energy landscape
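The arithmetic behind the paradox is easy to reproduce. The sketch below uses the slide's own figures (100 residues, 3 positions each, one conversion per 10^-13 seconds); the length of a year in seconds is the only value added here.

```python
RESIDUES = 100
POSITIONS_PER_RESIDUE = 3
SECONDS_PER_STEP = 1e-13
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

conformations = POSITIONS_PER_RESIDUE ** RESIDUES
years = conformations * SECONDS_PER_STEP / SECONDS_PER_YEAR

print(f"{conformations:.1e} conformations")
print(f"{years:.1e} years for exhaustive search")
```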

Page 34:

Finding a global minimum in a multidimensional case is easy only when the landscape is smooth. No matter where you start (1, 2 or 3), you quickly end up at the bottom -- the Native (N), functional state of the protein.

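[Figure: smooth, funnel-shaped energy landscape. Vertical axis: free energy. Horizontal axis: folding coordinate. Starting points 1, 2 and 3 are marked.]

Adapted from Ken Dill’s web site at UCSF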

Page 35:

Adapted from Dobson, Nature 426, 884 (2003)

Page 36:

Realistic landscapes are much more complex, with multiple local minima – folding traps.

Adapted from Ken Dill’s web site at UCSF

Page 37:

Adapted from Ken Dill’s web site at UCSF

Page 38:

Fold Optimization

Ken Dill's insight: hydrophobic collapse is the largest force driving folding

Simple lattice models (HP-models)

Classify residues as hydrophobic and polar

Use a lattice

Score a fold by the number of HH contacts

Page 39:

Scoring Lattice Models

H/P model scoring: count noncovalent hydrophobic interactions.

Sometimes: penalize for buried polar or surface hydrophobic residues
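A minimal sketch of the basic H/P score on a 2D square lattice follows. It counts pairs of H residues that sit on adjacent lattice points without being neighbours in the chain. The function name and the coordinate-list representation are assumptions made for illustration, and the optional penalties mentioned on this slide are not included.

```python
def hh_contacts(sequence, coords):
    """Count non-covalent H-H contacts for a fold on the square lattice.

    sequence: string over "H" and "P"
    coords:   list of (x, y) lattice points, one per residue, in chain order
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    index = {point: i for i, point in enumerate(coords)}
    score = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                score += 1
    return score

# A four-residue chain folded into a square brings its two ends together.
print(hh_contacts("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
```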

Page 40:

How can we search?

For smaller polypeptides, exhaustive search can be used

Looking at the “best” fold, even in such a simple model, can teach us interesting things about the protein folding process

For larger chains, other optimization and search methods must be used

Greedy, branch and bound

Evolutionary computing, simulated annealing

Graph theoretical methods
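For short chains the exhaustive option is simple to write down. The sketch below enumerates every self-avoiding walk on the 2D square lattice by depth-first search and keeps the fold with the most H-H contacts. The function names are hypothetical, and no symmetry reduction or pruning is attempted, so the running time grows roughly as 3^n and it is only practical for small n.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def count_contacts(sequence, coords):
    """H-H pairs on adjacent lattice points that are not chain neighbours."""
    index = {point: i for i, point in enumerate(coords)}
    score = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                score += 1
    return score

def best_fold(sequence):
    """Return (best score, coordinates) over all self-avoiding folds."""
    if not sequence:
        return 0, []
    best = [-1, None]

    def extend(coords, occupied):
        if len(coords) == len(sequence):
            score = count_contacts(sequence, coords)
            if score > best[0]:
                best[0], best[1] = score, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:          # keep the walk self-avoiding
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return best[0], best[1]

print(best_fold("HPPH"))
```

Greedy, branch and bound, or simulated annealing would replace the `extend` recursion while keeping the same scoring function.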

Page 41:

Hydrophobic zipper

Ken Dill ~ 1997

Page 42:

Absolute directions

UURRDLDRRU

Relative directions

LFRFRRLLFFL

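Advantage of relative directions: the immediate reversals UD or RL, which are possible in the absolute encoding, cannot occur

Only three directions: LRF

What about bumps (the chain running back into itself)? LFRRR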

Bad score

Use a better representation

Representing a lattice model
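A small sketch shows how a relative-direction string can be decoded into lattice coordinates and checked for bumps. The starting point, the initial heading (up), and the function names are assumptions made for illustration; the slide does not fix them.

```python
def decode_relative(moves, start=(0, 0), heading=(0, 1)):
    """Turn a string over L, R, F into a list of lattice points."""
    x, y = start
    dx, dy = heading
    path = [(x, y)]
    for move in moves:
        if move == "L":
            dx, dy = -dy, dx      # rotate heading 90 degrees left
        elif move == "R":
            dx, dy = dy, -dx      # rotate heading 90 degrees right
        elif move != "F":
            raise ValueError(f"unknown move {move!r}")
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def has_bump(path):
    """True if the walk visits some lattice point more than once."""
    return len(set(path)) != len(path)

path = decode_relative("LFRRR")
print(path, has_bump(path))
```

With these conventions the last step of LFRRR lands on the point reached after the first step, which is the bump the slide refers to.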

Page 43:

Preference-order representation

Each position has two “preferences”

If it can’t have either of the two, it will take the “least favorite” path if possible

Example: {LR},{FL},{RL},{FR},{RL},{RL},{FR},{RF}

Can still cause bumps: {LF},{FR},{RL},{FL},{RL},{FL},{RF},{RL},{FL}

Page 44:

Extensions

Other lattices have been used

Decrease the scale of the lattice, so atoms cannot fit on adjacent points


Page 45:

Mad Cow

Bovine Spongiform Encephalopathy (BSE) struck the UK in 1986. 170,000 cows affected. The brains of the dead “mad” cows resembled a sponge.

Similar to scrapie (sheep), Creutzfeldt-Jakob Disease (humans).

Dr. Prusiner identified the agent responsible for transmitting BSE as “proteinaceous infectious particles”, which he named prions.

Prions are proteins found in the nerve cells of all mammals. Abnormally-shaped prions are found in BSE cows. It is thought that the infectious prions fold in an unusual way.

Stanley Prusiner pioneered the study of prions, and received the Nobel Prize in 1997.

The normal protein has a secondary structure dominated by alpha helices. The abnormal version of the protein has the same primary structure, but its secondary structure is dominated by beta sheets.

Page 46:

Spread of Mad Cow

A person eats meat with an abnormally-shaped prion.

The prion is absorbed into the bloodstream and crosses into the nervous system.

The abnormal prion touches a normal prion and changes the normal prion's shape into an abnormal one, thereby destroying the normal prion's original function.

Both abnormal prions then contact and change the shapes of other normal prions in the nerve cell.

The nerve cell tries to get rid of the abnormal prions by clumping them together in small sacs. Because the nerve cells cannot digest the abnormal prions, they accumulate in the sacs that grow and engorge the nerve cell, which eventually dies.

When the cell dies, the abnormal prions are released to infect other cells. Large, sponge-like holes are left where many cells die.

Page 47:

Geometric Hashing

Docking Problem: will these two proteins bind together? If so, how?

One approach, Geometric Hashing, arose in Vision Research

We have a noisy image of the world

We are looking for certain objects with known shape: door, flight recorder

The object may be in view, but may be partly hidden

The points may be rotated, scaled, translated

Identify a set of key points

Process the image to obtain a set of candidate points

Try to quickly map a subset of the key points onto points in the image
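The idea can be sketched for 2D point sets as follows. This is a generic illustration under stated assumptions, not the procedure from the slides or from the Wolfson and Rigoutsos overview: every ordered pair of model points serves as a basis, the remaining points are expressed in that basis (which makes them invariant to rotation, translation, and uniform scaling), and the quantised coordinates are stored in a hash table. At recognition time each scene basis votes for the model bases it agrees with. The bin size `BIN` and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

BIN = 0.25  # assumed quantisation step for hash keys

def key(p, b0, b1):
    """Quantised coordinates of p in the frame where b0 -> 0 and b1 -> 1."""
    z = (complex(*p) - complex(*b0)) / (complex(*b1) - complex(*b0))
    return (round(z.real / BIN), round(z.imag / BIN))

def build_table(model):
    """Preprocessing: hash every model point under every ordered basis pair."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model)), 2):
        for k, p in enumerate(model):
            if k not in (i, j):
                table[key(p, model[i], model[j])].append((i, j))
    return table

def recognise(table, scene):
    """Return (votes, scene basis, model basis) for the best-supported match."""
    best = (0, None, None)
    for a, b in permutations(range(len(scene)), 2):
        votes = Counter()
        for k, p in enumerate(scene):
            if k not in (a, b):
                votes.update(table.get(key(p, scene[a], scene[b]), ()))
        if votes:
            basis, count = votes.most_common(1)[0]
            if count > best[0]:
                best = (count, (a, b), basis)
    return best

model = [(0, 0), (4, 0), (4, 2), (1, 3), (2, 1)]
# Scene: the model shifted by (10, -5), plus two clutter points.
scene = [(x + 10, y - 5) for x, y in model] + [(20, 20), (13, 9)]
print(recognise(build_table(model), scene))
```

The expensive part, building the table, is done once per model; recognition then needs only table lookups, which is why the method copes with partly hidden objects and cluttered scenes.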

Page 48:

Scaling

We have a corpus of information

Page 49:

Problems

Page 50:

Problems

We have a corpus of information

Page 51:

Patch Dock

Assume we know the shape of two proteins

Will they fit together?

Page 52:

Patchdock

Traverse the surface, looking for local reference points

Convex, concave, saddle

Split the surface into patches of nearly equal size

Merge small patches, split large ones

Pair up matches

Convex on A, concave on B

Score the matches

Page 53:

Summary

Protein folding is a rich area

I have not left you with any explicit computation

I hope I have left you with an overview of the area, and an interest in learning more