Protein-Protein Docking

Post on 12-Feb-2022


Center for Bioinformatics, Saarbrücken, Germany

Celera Genomics, Rockville, MD, USA

Protein-Protein Docking: Basics and New Applications

Oliver Kohlbacher

Hans-Peter Lenhof

Overview

• Introduction
• Basic Algorithms + Scoring Functions
• Integration of Protein Flexibility
– Flexibility in Proteins
– Algorithms for Semi-Flexible Docking
• Integration of Experimental Data
– NMR Spectroscopy
– NMR-Based Protein Docking
• Outlook + Summary

Protein Docking: What, How, Why?

Protein Docking: Introduction

Protein-Protein Docking


• Given two proteins A and B

• Predict complex structure AB

Protein Docking: Introduction

Protein-Protein Docking

• Uniform chemistry
• Understanding protein interactions
• Speed up structure elucidation
• Predict protein interactions

Protein Docking: Introduction

Protein-Ligand Docking

• Small, flexible ligand
• Ligands are chemically diverse
• Crucial for drug design
• Virtual screening

Protein Docking: Introduction

Lock-and-Key Principle

Emil Fischer, 1894: “To use an image, I would say that enzyme and glycoside have to fit into each other like a lock and a key, in order to exert a chemical effect on each other.”


Scoring Functions for Protein Docking

Lock-and-Key Principle

[Figure: lock-and-key fit judged by geometry (shape) and chemistry (charges + and -)]

Algorithms for Protein Docking: Basics

Protein Docking: How?

• Structure generation

• Filtering

• Evaluation

Algorithms for Protein Docking: Basics

Overview of Docking Techniques

• Connolly surface: Connolly 1986; Bacon, Moult 1992; Fischer et al. 1995; Lin et al. 1994; Norel et al. 1995; Sandak et al. 1995; Campbell et al. 1996; Ackermann et al. 1995
• Fuzzy logic: Exner, Brickmann 1997
• Cube representation: Jiang, Kim 1991
• Graph representation: Shoichet et al. 1992; Shoichet, Kuntz 1996; Kasinos et al. 1992
• Correlation: Katchalski-Katzir et al. 1992; Vakser, Aflalo 1994; Vakser 1996; Gabb et al. 1997; Meyer et al. 1996
• Monte Carlo approach: Cherfils et al. 1991; Totrov, Abagyan 1994
• Slices representation: Walls, Sternberg 1992; Helmer-Citterich et al. 1994; Ausiello et al. 1997
• Genetic algorithms: Levine et al. 1997
• Database: Ester et al. 1995


Algorithms for Protein Docking: Basics

What are the differences?

• Method for determining tentative structures
– Based on surface complementarity
– Based on triangles
– …
• Scoring functions employed
– Contact surface area
– Electrostatics
– Hydrogen bonds
– …

Algorithms for Protein Docking: Basics

Structure Generation

• Assumption: Proteins are rigid bodies!
• Three-dimensional “puzzle”
• Six degrees of freedom
– Rotation
– Translation
• Identify rigid transformations bringing B in contact with A
• Discretization
– Of proteins, protein surface
– Of rotational/translational space
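The six degrees of freedom of a rigid body (three rotation angles, three translation components) can be illustrated with a minimal sketch. The function names, the Euler-angle convention (rotation about x, then y, then z), and the toy coordinates are assumptions made for this illustration; they are not taken from the slides.

```python
import math

def rotation_matrix(alpha, beta, gamma):
    """Rotation R = Rz(gamma) * Ry(beta) * Rx(alpha); angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]

def apply_rigid_transform(coords, angles, translation):
    """Rotate every point, then translate it; internal distances are unchanged."""
    rot = rotation_matrix(*angles)
    moved = []
    for point in coords:
        rotated = [sum(rot[r][c] * point[c] for c in range(3)) for r in range(3)]
        moved.append(tuple(rotated[r] + translation[r] for r in range(3)))
    return moved

# Hypothetical two-atom "protein B": rotate 90 degrees about z, then shift.
protein_b = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
moved_b = apply_rigid_transform(protein_b, (0.0, 0.0, math.pi / 2), (1.0, 2.0, 3.0))
```

A docking search would enumerate a discretized set of such (angles, translation) pairs and keep those that bring B into contact with A.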

Algorithms for Protein Docking: Basics

Structure Generation

• Katchalski-Katzir et al., Proc. Natl. Acad. Sci. USA, 1992
– Grid-based correlation of A and B
• Lenhof, RECOMB 97
– Mapping of triangles

Lenhof, RECOMB 1997

Structure Generation

• Calculate “critical points” on the surface of A
• Calculate triangles defined by critical points
• Calculate triangles defined by atoms of B
• Search for triangles with similar side lengths (hash table)
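The triangle-matching idea can be sketched as follows. This is an illustration only, not the published algorithm: the critical points are assumed to be given, the bin width of 0.5 and all function names are hypothetical, and a plain dictionary stands in for the hash table.

```python
import math
from collections import defaultdict
from itertools import combinations

def side_key(p, q, r, bin_width):
    """Hash key: the three sorted side lengths, each quantized to a bin index."""
    sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
    return tuple(int(s // bin_width) for s in sides)

def build_table(points, bin_width=0.5):
    """Store every triangle of `points` under its side-length key."""
    table = defaultdict(list)
    for tri in combinations(range(len(points)), 3):
        key = side_key(*(points[i] for i in tri), bin_width=bin_width)
        table[key].append(tri)
    return table

def matching_triangles(points_a, points_b, bin_width=0.5):
    """Return pairs (triangle of A, triangle of B) that fall into the same bin."""
    table = build_table(points_a, bin_width)
    matches = []
    for tri_b in combinations(range(len(points_b)), 3):
        key = side_key(*(points_b[i] for i in tri_b), bin_width=bin_width)
        for tri_a in table.get(key, []):
            matches.append((tri_a, tri_b))
    return matches

# Toy data: A contains a 3-4-5 triangle, B contains a translated copy of it.
critical_points_a = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (10, 10, 10)]
atoms_b = [(1, 1, 1), (4, 1, 1), (1, 5, 1)]
matches = matching_triangles(critical_points_a, atoms_b)
```

Each matched pair of triangles fixes a candidate rigid transformation of B onto A. Simple binning can miss triangles whose side lengths fall just across a bin boundary, so a practical implementation would also probe neighboring bins.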


Scoring Functions for Protein Docking

Scoring Functions

• What distinguishes the true complex structure from “false positives”?

• Physical chemistry: The complex structure with the lowest binding free energy is the one observed in nature.

• Caveat: relies on sufficiently complete sampling of conformation space

Scoring Functions for Protein Docking

Prediction of Free Energy

• Also hardest part in related problems:

– Ligand docking

– Protein structure prediction

• Many methods neglect

– entropic contributions

– solvent effects

Scoring Functions for Protein Docking

Scoring Functions

• Simplest functions: Geometry

– Lock-and-key principle

– Large contact areas are favorable

– Overlaps between A and B are unfavorable

• More sophisticated: “Chemistry”

– Models based on physicochemistry

– Compromise between complexity and accuracy

Scoring Functions for Protein Docking

[Figure: the scoring function combines a geometry term and a chemistry term (charges + and -)]

Scoring Functions for Protein Docking

Geometric Scoring Function


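Determine the distance between all pairs (a, b) with a ∈ A, b ∈ B:

$$\mathrm{contact}(AB) = \left|\{(a, b) : 2.75 < \mathrm{dist}(a, b) < 4.0\}\right|$$

$$\mathrm{overlap}(AB) = \left|\{(a, b) : \mathrm{dist}(a, b) < 2.75\}\right|$$

$$\mathrm{geom\_score}(AB) = k_1 \cdot \mathrm{contact}(AB) - k_2 \cdot \mathrm{overlap}(AB)$$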
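The geometric score is a matter of counting atom pairs in two distance ranges. A minimal sketch, assuming atoms are given as coordinate tuples; the weights k1 = 1 and k2 = 3 are placeholder values chosen for illustration, since the slides do not state them.

```python
import math

def geom_score(atoms_a, atoms_b, k1=1.0, k2=3.0, d_overlap=2.75, d_contact=4.0):
    """Return (score, contact, overlap) for two lists of atom coordinates."""
    contact = 0
    overlap = 0
    for a in atoms_a:
        for b in atoms_b:
            d = math.dist(a, b)
            if d < d_overlap:
                overlap += 1          # atoms interpenetrate: penalized
            elif d_overlap < d < d_contact:
                contact += 1          # atoms touch: rewarded
    return k1 * contact - k2 * overlap, contact, overlap

# Toy example: one atom of A against three atoms of B.
score, n_contact, n_overlap = geom_score(
    [(0.0, 0.0, 0.0)],
    [(3.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
)
```

The double loop is quadratic in the number of atoms; restricting it to surface atoms or using a spatial grid would be the usual refinement.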


Scoring Functions for Protein Docking

Chemical Scoring

$$\mathrm{chem\_score}(AB) = \sum_{(a, b) \in \mathrm{contact}(AB)} M(\mathrm{type}(a), \mathrm{type}(b))$$
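The chemical score sums a type-dependent value M over all contacting atom pairs. The sketch below assumes a hypothetical two-type matrix (opposite charges rewarded, like charges penalized); the actual atom types and matrix entries are not given in the slides.

```python
import math

# Hypothetical pair scores, for illustration only.
PAIR_MATRIX = {
    ("+", "-"): 1.0, ("-", "+"): 1.0,
    ("+", "+"): -1.0, ("-", "-"): -1.0,
}

def chem_score(atoms_a, atoms_b, matrix, d_min=2.75, d_max=4.0):
    """Sum matrix[type(a), type(b)] over all pairs in the contact distance range."""
    total = 0.0
    for pos_a, type_a in atoms_a:
        for pos_b, type_b in atoms_b:
            if d_min < math.dist(pos_a, pos_b) < d_max:
                total += matrix.get((type_a, type_b), 0.0)
    return total

# Toy example: one positive atom of A, two negative atoms of B in contact,
# and one positive atom of B too far away to count.
example_score = chem_score(
    [((0.0, 0.0, 0.0), "+")],
    [((3.0, 0.0, 0.0), "-"), ((0.0, 3.5, 0.0), "-"), ((9.0, 0.0, 0.0), "+")],
    PAIR_MATRIX,
)
```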

Scoring Functions for Protein Docking

[Figure: 2PTC, components A and B and the docked complex AB; candidate No. 1 has RMSD = 1.94 Å]

Scoring Functions for Protein Docking

Results: Docking of 2PTC

[Plot: RMSD versus score]

Katchalski-Katzir et al., PNAS 1992

Basic Ideas

• Protein on grid
• Assign values
– a_{i,j,k} = 1 at the surface of A, ρ << 0 inside A, 0 outside A
– b_{i,j,k} = 1 at the surface of B, δ > 0 inside B, 0 outside B

Product a · b for each combination of grid cell types:

A \ B      outside   surface   inside
outside    0         0         0
surface    0         1         δ > 0
inside     0         ρ < 0     ρ · δ < 0


Katchalski-Katzir et al., PNAS 1992

Calculate Correlation c_{α,β,γ}

• Correlation of a and b identifies those translations where A and B are in contact:

$$c_{\alpha,\beta,\gamma} = \sum_{i,j,k} a_{i,j,k} \cdot b_{i+\alpha,\, j+\beta,\, k+\gamma}$$

• Repeat for all translation vectors (α, β, γ)

→ Run time O(N^6)!
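The direct evaluation of the correlation can be written down literally, which also makes the O(N^6) cost visible: N^3 translation vectors, each requiring a sum over N^3 grid cells. This is a sketch on a toy 3 × 3 × 3 grid; the helper names, the choice ρ = -15, and the treatment of cells shifted outside the grid as empty are assumptions for this illustration.

```python
def empty_grid(n):
    """n x n x n grid filled with 0 (the 'outside' value)."""
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]

def correlate(a, b):
    """c[(al, be, ga)] = sum over i, j, k of a[i][j][k] * b[i+al][j+be][k+ga]."""
    n = len(a)
    shifts = range(-(n - 1), n)
    c = {}
    for al in shifts:
        for be in shifts:
            for ga in shifts:
                total = 0.0
                for i in range(n):
                    for j in range(n):
                        for k in range(n):
                            x, y, z = i + al, j + be, k + ga
                            if 0 <= x < n and 0 <= y < n and 0 <= z < n:
                                total += a[i][j][k] * b[x][y][z]
                c[(al, be, ga)] = total
    return c

# Toy molecules: A has one interior cell (rho = -15) and one surface cell (1);
# B consists of a single surface cell (1).
grid_a = empty_grid(3)
grid_a[1][1][1] = -15.0
grid_a[0][1][1] = 1.0
grid_b = empty_grid(3)
grid_b[2][1][1] = 1.0

c = correlate(grid_a, grid_b)
best_shift = max(c, key=c.get)
```

Shifts that place B's surface on A's surface score positively, while shifts that push B into A's interior are penalized by ρ.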

Katchalski-Katzir et al., PNAS 1992

[Figure: cross section c_{α=0,β,γ}. From: Katchalski-Katzir et al., PNAS 1992, 2195]

Katchalski-Katzir et al., PNAS 1992

Algorithm

• Calculate a_{i,j,k} and A* = [FFT(a)]*
• For all rotations of B:
– Calculate b_{i,j,k} and B = FFT(b)
– Calculate C = A* · B
– Calculate c_{i,j,k} = IFT(C)
– Identify tentative transformations (α, β, γ) as strongly positive peaks in c
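The identity behind these steps is the correlation theorem: the transform of the correlation equals the conjugated transform of a times the transform of b. The sketch below checks this in one dimension with a deliberately naive O(n²) discrete Fourier transform, so it demonstrates the identity but not the speed-up; an actual implementation would use a 3D FFT routine. It assumes periodic (cyclic) shifts, and all names and toy values are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def correlate_direct(a, b):
    """c[s] = sum over i of a[i] * b[(i + s) mod n], evaluated term by term."""
    n = len(a)
    return [sum(a[i] * b[(i + s) % n] for i in range(n)) for s in range(n)]

def correlate_fourier(a, b):
    """Same correlation via C = conj(DFT(a)) * DFT(b), c = IDFT(C)."""
    a_conj = [v.conjugate() for v in dft(a)]
    b_hat = dft(b)
    c_hat = [x * y for x, y in zip(a_conj, b_hat)]
    return [v.real for v in dft(c_hat, inverse=True)]

# Toy 1D "molecules": surface cells = 1, interior of a = -15, interior of b = 2.
a = [0, 1, -15, 1, 0, 0, 0, 0]
b = [0, 0, 0, 0, 1, 2, 1, 0]
direct = correlate_direct(a, b)
fourier = correlate_fourier(a, b)
```

With a fast Fourier transform, each rotation of B costs on the order of N³ log N³ operations instead of N⁶.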

From Rigid to Flexible Docking


13122114TPI

131225521TGS

11221802TGP

18166951TEC

15133812SNI

11317272136914SGB

13124924935182SEC

111221612PTC

18122492MHB

1912291256722KAI

11012224HVP

15>125106104176373HFM

128252467922HFL

171233345529001243091FDL

151331064CPA

1312221CHO

L97 | AHP+96 | MWS96 | NLW+95 | NLW+95 | NLW+95 | PDB ID

Comparison of Algorithms


[Plot: RMSD versus Fit = GeomScore + ChemScore for 2PTC_E / 2PTC_I]

One PS in the Life of PTI

[Plot: RMSD versus Fit = GeomScore + ChemScore for 1TPO / 4PTI]

[Figures: 2PTC, 1TPO, 4PTI]

From Rigid to Flexible Docking

Protein Flexibility

• Domain movements
– Large-scale movements of protein domains
– Mainly backbone movements
– Highly flexible “hinge” regions: hinge bending
• Side-chain flexibility
– Rather rigid backbone
– Local rearrangements of side chains
– Flexibility based on side-chain torsions

From Rigid to Flexible Docking

Semi-Flexible Docking

• Assumptions
– Rigid backbone
– Flexible side chains
• Basic Algorithm
– Start with rigid docking candidates
– Demangle overlapping side chains
– Calculate energy

Good approximation for Trypsin/BPTI!

Protein Docking with Flexible Side Chains


14.12.99

ARG 39

Protein Docking with Flexible Side Chains

• Apply the rigid-body docking (RBD) algorithm
• Optimize the side chains in the docking site of the best RBD candidates w.r.t. the potential energy (AMBER), using a rotamer library

From Rigid to Flexible Docking

[Figures: torsion angle distributions for LYS and SER]

Rotamer Library

Side-chain conformational space is adequately represented by a discrete set of rotamers

(1) Determine all side chains in the docking site!
(2) Build sets of potential side-chain orientations
(3) Determine the rotamer combination with minimal energy!

Example: side chains S1 and S2, each with rotamers R1 and R2

• One binary variable per rotamer: X11, X12 (for S1) and X21, X22 (for S2), with Xij ∈ {0, 1}
• A rotamer combination is a 0/1 vector, e.g. (X11, X12, X21, X22) = (0, 1, 1, 0)
• Exactly one rotamer per side chain:

$$\sum_j X_{ij} = 1 \quad \forall i$$

• Energy to be minimized:

$$\min\; E_T + \sum_{ij} X_{ij} E_{ij} + \sum_{ij} \sum_{kl,\, k \neq i} X_{ij} X_{kl} E_{ijkl}$$

• Pairwise energies shifted by E_max:

$$\min\; E_T + \sum_{ij} X_{ij} E_{ij} + \sum_{ij} \sum_{kl,\, k \neq i} X_{ij} X_{kl} (E_{ijkl} - E_{max})$$

• Products X_{ij} X_{kl} replaced by variables Y_{ijkl}, which gives the ILP:

$$\min\; E_T + \sum_{ij} X_{ij} E_{ij} + \sum_{ij} \sum_{kl,\, k \neq i} Y_{ijkl} (E_{ijkl} - E_{max})$$

$$Y_{ijkl} \le X_{ij}, \quad Y_{ijkl} \le X_{kl}, \quad Y_{ijkl} \in \{0, 1\}$$

$$\sum_j X_{ij} = 1 \quad \forall i, \quad X_{ij} \in \{0, 1\}$$

Solve ILP

• Solve LP: relax the variables to Xij ∈ [0, 1] and Yijkl ∈ [0, 1]
• LP solution feasible for the ILP: done
• LP solution infeasible for the ILP: search for a cutting plane, add it to the LP, and solve the LP again
• If 0 < Xij < 1 remains: branch into “Solve ILP with xij = 0” and “Solve ILP with xij = 1”
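For the two-side-chain example, the minimum can be found by simply enumerating all rotamer combinations, which makes the meaning of the X vector concrete. The energies below are invented for illustration, and each pair of side chains is counted once (i < k), whereas the double sum over k ≠ i visits every pair in both orders.

```python
from itertools import product

def best_rotamer_combination(e_template, e_single, e_pair):
    """Enumerate one rotamer per side chain; return (energy, X vector)."""
    n = len(e_single)
    best_energy, best_choice = None, None
    for choice in product(*(range(len(rot)) for rot in e_single)):
        energy = e_template
        energy += sum(e_single[i][choice[i]] for i in range(n))
        energy += sum(e_pair[(i, choice[i], k, choice[k])]
                      for i in range(n) for k in range(i + 1, n))
        if best_energy is None or energy < best_energy:
            best_energy, best_choice = energy, choice
    # Expand the choice into the 0/1 vector (X11, X12, X21, X22, ...).
    x_vector = tuple(int(best_choice[i] == j)
                     for i in range(n) for j in range(len(e_single[i])))
    return best_energy, x_vector

# Hypothetical energies: E_SINGLE[i][j] for rotamer j of side chain i,
# E_PAIR[(i, j, k, l)] for rotamer j of side chain i with rotamer l of side chain k.
E_TEMPLATE = 10.0
E_SINGLE = [[2.0, 1.0], [0.5, 1.5]]
E_PAIR = {(0, 0, 1, 0): 0.0, (0, 0, 1, 1): 0.0,
          (0, 1, 1, 0): -1.0, (0, 1, 1, 1): 3.0}

min_energy, x_best = best_rotamer_combination(E_TEMPLATE, E_SINGLE, E_PAIR)
```

Enumeration is only viable for toy cases; the number of combinations grows exponentially with the number of side chains, which is what motivates the ILP formulation.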

From Rigid to Flexible Docking

Test Case: Trypsin/BPTI

BPTI

Trypsin

first approximation: rank 3

(docking of native structures)


From Rigid to Flexible Docking

What’s the problem?

• LYS 15 moves on docking
• Overlapping atoms
• Rigid docking must fail


From Rigid to Flexible Docking

Semi-Flexible Protein Docking

• Structure generation

• Filtering

• Side-chain demangling

• Final energetic evaluation

From Rigid to Flexible Docking

Side-Chain Flexibility

• Bond distances and bond angles constant

• Torsion angles variable

From Rigid to Flexible Docking

Torsion Angle Space

• Instead of 10 - 20 atoms with 3 coordinates

• Up to 4 torsion angles

• Typical binding site:

– 50 amino acids, 600 atoms

– 200 instead of 1800 degrees of freedom

From Rigid to Flexible Docking

Torsion angle distribution

[Plot: histogram of counts (0 to 1200) versus χ1 [°] (0 to 360)]

[Plot: χ2 [°] versus χ1 [°], both axes 0 to 360]

Flexible Docking: Combinatorial Problem

Combinatorial Problem

• Identify set of rotamers with lowest energy

• Typical binding site: 50 side-chains

• Binding site defined via distance between A/B

• Number of possible combinations: ~10^60
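The order of magnitude can be checked with a short calculation. The number of rotamers per side chain is not stated on the slide; the value 16 used here is an assumption chosen only because it reproduces the quoted magnitude for 50 side chains.

```python
import math

def count_combinations(n_side_chains, rotamers_per_side_chain):
    """Number of ways to pick one rotamer for each side chain."""
    return rotamers_per_side_chain ** n_side_chains

# Assumed: 50 side chains (from the slide), 16 rotamers each (hypothetical).
total = count_combinations(50, 16)
exponent = math.log10(total)  # roughly 60, i.e. about 10^60 combinations
```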

Flexible Docking: Combinatorial Problem

Scoring Function

• No sophisticated energetic evaluation feasible

• Simple and fast scoring function: AMBER

• Decomposition into three components:

$$E_{total} = E^{tpl} + \sum_i E^{tpl}_{i_r} + \sum_i \sum_{i<j} E^{pw}_{i_r, j_s}$$

Flexible Docking: Combinatorial Problem

Earlier Approaches

• Leach (1994): A* algorithm for side-chain optimization (ligand docking)

• Desmet et al. (1992): Dead End Elimination

– Simple inequality

– Iteratively applied

Flexible Docking: Combinatorial Problem

How to solve the problem?

• Multi Greedy method

– Fast and simple heuristic

– Suboptimal solutions

• ILP formulation

– Optimal solution

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

[Figure: enumeration tree. The root node carries E^{tpl}; the first level holds the rotamers 1_1, 1_2 of side chain 1 with energies E_1, E_2; the second level holds, below each of them, the rotamers 2_1 … 2_4 of side chain 2 with energies E_{1/1} … E_{1/4} and E_{2/1} … E_{2/4}.]

$$E_{total} = E^{tpl} + \sum_i E^{tpl}_{i_r} + \sum_i \sum_{i<j} E^{pw}_{i_r, j_s}$$

$$E_1 = E^{tpl}_{1_1}$$

$$E_{1/1} = E^{tpl}_{1_1} + E^{tpl}_{2_1} + E^{pw}_{1_1, 2_1}$$

Flexible Docking: Multi Greedy Approach

Multi Greedy Approach

• Fast

• Correctly demangles side chains for test set

• May yield suboptimal solutions

• How to determine the quality of the solution?

• Optimal algorithm: based on polyhedral optimization, ILP formulation
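One plausible reading of the multi greedy method is a level-by-level search that keeps a fixed number of the best partial rotamer assignments instead of only one. The sketch below follows that reading; the parameter `width`, the data layout, and the toy energies are assumptions, and the original method may differ in its details.

```python
import heapq

def multi_greedy(e_tpl, e_rot, e_pw, width):
    """Keep the `width` best partial assignments per level; return the best full one.

    e_rot[i][r]        : energy of rotamer r of side chain i against the template
    e_pw[(j, s, i, r)] : pairwise energy of rotamer s of side chain j (j < i)
                         with rotamer r of side chain i
    """
    partial = [(e_tpl, ())]
    for i, rotamer_energies in enumerate(e_rot):
        expanded = []
        for energy, chosen in partial:
            for r, e_r in enumerate(rotamer_energies):
                pair = sum(e_pw[(j, s, i, r)] for j, s in enumerate(chosen))
                expanded.append((energy + e_r + pair, chosen + (r,)))
        partial = heapq.nsmallest(width, expanded)
    return partial[0]

# Hypothetical energies where the locally best first choice is globally bad.
E_ROT = [[0.0, 1.0], [0.0, 0.0]]
E_PW = {(0, 0, 1, 0): 5.0, (0, 0, 1, 1): 5.0,
        (0, 1, 1, 0): -3.0, (0, 1, 1, 1): 0.0}

greedy = multi_greedy(0.0, E_ROT, E_PW, width=1)  # plain greedy
multi = multi_greedy(0.0, E_ROT, E_PW, width=2)   # keeps two candidates per level
```

In this toy case plain greedy commits to the cheaper first rotamer and ends with a worse total, while keeping two candidates recovers the better combination. Neither variant certifies optimality, which is the gap the ILP formulation addresses.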

ILP-Based Algorithm

Graph Problem

• For each rotamer r of side chain i create a node $v_{i_r}$ with weight $E(v_{i_r}) = E^{tpl}_{i_r}$
• A k-partite graph with partitions V1 … Vk is constructed
• Pairwise interactions are represented by edges uv with weight $E(uv) = E^{pw}_{i_r, j_s}$
• Total energy:

$$E_{total} = E^{tpl} + \sum_i E^{tpl}_{i_r} + \sum_i \sum_{i<j} E^{pw}_{i_r, j_s}$$

• Energy of a rotamer graph RG:

$$E(RG) = \sum_{v \in RG} E(v) + \sum_{uv \in RG} E(uv)$$

• Binary decision variables for nodes and edges
– x_v for each node v
– x_uv for each edge uv
• Node is selected if x_v = 1
• Edge is selected if x_uv = 1

$$E = \sum_{v \in V} x_v E(v) + \sum_{uv \in E} x_{uv} E(uv)$$

Integer Linear Program

Determine x_v, x_uv that minimize E

ILP-Based Algorithm

ILP: Basic Constraint System

$$\min \left( \sum_{v \in V} x_v E(v) + \sum_{uv \in E} x_{uv} E(uv) \right)$$

s.t.

$$\sum_{v \in V_i} x_v = 1 \quad \text{for all } i \in \{1 \ldots k\}$$

$$x_{uv} \le x_u \quad \text{for all } uv \in E$$

$$x_{uv} \le x_v \quad \text{for all } uv \in E$$

With all energies shifted by E_max, the objective becomes

$$\min \left( \sum_{v \in V} x_v (E(v) - E_{max}) + \sum_{uv \in E} x_{uv} (E(uv) - E_{max}) \right)$$

subject to the same constraints.
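A likely reason for the shift by E_max is that it makes the edge coefficients negative: the constraints only bound x_uv from above, so the minimization itself must push x_uv to 1 whenever both endpoints are selected, and it only does so if selecting the edge lowers the objective. The small check below illustrates this for a single edge; the function names are hypothetical and the explanation is an interpretation, not a statement from the slides.

```python
from itertools import product

def best_edge_value(x_u, x_v, weight):
    """Minimize x_uv * weight subject to x_uv <= x_u, x_uv <= x_v, x_uv in {0, 1}."""
    feasible = [x_uv for x_uv in (0, 1) if x_uv <= x_u and x_uv <= x_v]
    return min(feasible, key=lambda x_uv: x_uv * weight)

def linearization_is_exact(weight):
    """True if the optimal x_uv equals the product x_u * x_v in all four cases."""
    return all(best_edge_value(x_u, x_v, weight) == x_u * x_v
               for x_u, x_v in product((0, 1), repeat=2))

exact_for_negative = linearization_is_exact(-2.0)
exact_for_positive = linearization_is_exact(3.0)
```

With a positive weight the optimizer would leave the edge unselected even when both rotamers are chosen, so an unfavorable interaction would go uncounted.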

ILP-Based Algorithm: Branch & Cut

Branch & Cut

• Solve ILP: start by solving the LP
• LP solution feasible: done
• LP solution infeasible: search for a cutting plane, add it to the LP, and solve the LP again
• Otherwise branch: “Solve ILP with xij = 0” and “Solve ILP with xij = 1”

Semi-Flexible Docking: Results

Energetic Evaluation

• Based on Jackson & Sternberg
• Energetic evaluation has to be extended:
– inclusion of internal energies
• AMBER torsion energies:

$$\Delta G_{bind} = \Delta G_{ele} + \Delta G_{cav} + \Delta G_{tors}$$

ILP-Based Algorithm

Implementation

• Molecular data structures, rotamers, energies: BALL

• Branch-&-Cut: LEDA and ABACUS

• LP solver: CPLEX and SOPLEX

ILP-Based Algorithm: Results

Docking Trypsin/BPTI - Results

Semi-Flexible Docking: Results

Running Times (UltraSparc II, 333 MHz, per candidate)

Avg. time [min]

Stage                      Multi Greedy   ILP
Calculation of energies    30             30
Combinatorial problem      3              14
Side-chain optimization    5              5
Energetic evaluation       70             70
Total                      108            119

Semi-Flexible Docking: Results

Results I

Example      RMSD/Å
1TPO/4PTI    1.4
1SBC/2CI2    1.8
5CHA/2OVO    3.2

Semi-Flexible Docking: Results

Results II

[Figure: BPTI (LYS 15)]

Semi-Flexible Docking: Results

Results III

[Plot: RMSD [10^-10 m] versus ∆G_bind [kJ/mol] for Trypsin/BPTI]

• Holm & Sander 1992: Monte Carlo
• Bruccoleri & Novotny 1992: Exhaustive search
• Desmet et al. 1992: Dead-end elimination
• Totrov & Abagyan 1994a: Monte Carlo
• Totrov & Abagyan 1994b: Monte Carlo
• Laughton 1994: Local homology modeling
• Leach 1994: A* algorithm
• Weng et al. 1996: Exhaustive search
• Leach & Lemon 1998: A* algorithm

Integration of Experimental Data

• Main problem in Docking: Energy Function

• Avoid the energy calculation!

• Include experimental data

– Containing geometric information

– Derived from simple and inexpensive experiment

→ Nuclear Magnetic Resonance (NMR)

NMR-Based Protein Docking

Overview

• What is NMR spectroscopy?

• Defining an NMR-based scoring function
– Predicting chemical shifts
– Synthesizing spectra
– Comparing NMR spectra

• Results

NMR - Basics

• ¹H nuclei possess spin angular momentum
• In an external magnetic field B0, each nucleus can assume one of two spin states: α or β
• Supplying energy ∆E = h · ν enables transition
• ∆E depends on
– magnitude of external field
– electronic structure surrounding the nucleus (chemical shift δ)

NMR - Basics

NMR: The Hardware

NMR - Basics

1D 1H-NMR spectra

NMR - Basics

Structural Information in Spectra

• Chemical shift in proteins depends on
– Topology (chemical environment)
– Geometry (conformation)
• Certain experiments can
– identify which proton causes which peak (assignment)
– yield distance information (e.g., NOE constraints)

NMR-Based Protein Docking

[Figure: spectra of the docking candidates are compared with the experimental spectrum; the candidates are ordered by their differences ∆]

NMR-Based Protein Docking

Chemical shift model, built up term by term:

• Random coil shift: $\delta = \delta_{RC}$
• Electrostatic field: $\delta = \delta_{RC} + \delta_{ES}$
• Ring current effect: $\delta = \delta_{RC} + \delta_{ES} + \delta_{RS}$
• Magnetic anisotropy: $\delta = \delta_{RC} + \delta_{ES} + \delta_{RS} + \delta_{Aniso}$

NMR-Based Protein Docking

Spectrum Synthesis/Comparison

• Removal of rapidly exchanging protons
• Lorentzian peaks of equal width W:

$$S(\delta) = \sum_i \frac{1}{1 + \dfrac{(\delta - \delta_i)^2}{W^2}}$$

• Spectrum comparison via difference area ∆:

$$\Delta(S_{calc}, S_{exp}) = \int \left| S_{calc}(\delta) - S_{exp}(\delta) \right| \, d\delta$$
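Spectrum synthesis and comparison can be sketched numerically: each predicted shift contributes one Lorentzian peak, and two spectra are compared by integrating the absolute difference. The peak form, the integration range of -2 to 12 ppm, the midpoint rule, and the toy shift values are assumptions of this sketch rather than details given on the slides.

```python
def spectrum(shifts, width):
    """Return S(delta) as a sum of equal-width Lorentzian peaks at `shifts`."""
    def s(delta):
        return sum(1.0 / (1.0 + ((delta - d_i) / width) ** 2) for d_i in shifts)
    return s

def difference_area(s_calc, s_exp, lo=-2.0, hi=12.0, steps=2800):
    """Approximate the integral of |S_calc - S_exp| with the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for t in range(steps):
        delta = lo + (t + 0.5) * h
        total += abs(s_calc(delta) - s_exp(delta))
    return total * h

# Hypothetical shifts (ppm): an "experimental" spectrum and two candidates.
s_exp = spectrum([1.0, 4.5, 8.2], width=0.05)
s_close = spectrum([1.05, 4.5, 8.2], width=0.05)
s_far = spectrum([2.0, 4.5, 8.2], width=0.05)

area_same = difference_area(s_exp, s_exp)
area_close = difference_area(s_close, s_exp)
area_far = difference_area(s_far, s_exp)
```

A docking candidate whose predicted shifts lie closer to the measured ones yields a smaller difference area and would therefore be ranked higher.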

NMR-Based Protein Docking

Synthesized Spectra

NMR-Based Protein Docking

Results – Test Set

• Bound complex structures (PDB)
– 1DT7 A/B
– 1DT7 A/X
– 1CFF
– 1CKK
• NMR assignments (BMRB)
– Reconstruction of exp. spectra

NMR-Based Protein Docking

Results - Overview

Complex      # false positives among top 10   Rank of 1st true positive
1DT7 A/B     2                                1
1CKK         2                                1
1CFF         4                                2
1DT7 A/X     0                                1

NMR-Based Protein Docking

1DT7 A/X (NMR-based)

NMR-Based Protein Docking

1DT7 A/X (ACE)

NMR-Based Protein Docking

Open Problems

• Improving the shift model

– H-bonds

– Solvent effects

• Modeling peak width, couplings?

• Blind predictions of protein complexes

• Validation with larger data set and exp. data

Summary

• Protein Docking is an interesting problem!

• Key difficulties

– Energetic evaluation

– Protein flexibility

• Integration of experimental evidence improves scoring

Scoring Functions for Protein Docking

Free Energy Functions

• Contact-based

– Atomic contact potential

– Residue contact potentials

• Molecular Mechanics

• Surface-based potentials

• …..

Scoring Functions for Protein Docking

Contributions to Free Energy I

• Hydrogen bonds

• Salt bridges

• Entropic contributions

– Solvent effects

– Loss of side-chain entropy

– Loss of DOF on binding

Scoring Functions for Protein Docking

Contributions to Free Energy II

• Electrostatic interactions

– Coulomb-based

– Continuum models (solvent effects)

• Hydrophobic interactions

• Van-der-Waals interactions

• ….

Scoring Functions for Protein Docking

Energetic Evaluation

Jackson, Sternberg (1994):

– continuum electrostatics
• solution of the Poisson-Boltzmann equation
• includes electrostatic solvent effects
– cavitation free energy
• measure for the hydrophobic interaction

$$\Delta G_{bind} = \Delta G_{ele} + \Delta G_{cav} + \Delta G_{conf} + \Delta G_{vdW}$$

$$\Delta G_{bind} = \Delta G_{ele} + \Delta G_{cav}$$


Scoring Functions for Protein Docking

Energetic Evaluation

Jackson, Sternberg (1994):

$$\Delta G_{bind} = \Delta G_{ele} + \Delta G_{cav}$$

$$\Delta G_{ele} = \Delta\Delta G^{A}_{sol} + \Delta\Delta G^{B}_{sol} + \Delta G^{A-B}_{int}$$

BALL: Design and Architecture

Poisson-Boltzmann Equation

• Spatially dependent dielectric constant ε(r)
– ~2-4 inside protein
– 78 outside (water)
• Charge distribution ρ → potential φ
• Solve differential equation on grid
• Yields total electrostatic free energy ∆G_ES

$$\nabla \cdot \left[ \varepsilon(\mathbf{r}) \nabla \varphi(\mathbf{r}) \right] - \kappa^2(\mathbf{r}) \sinh\!\left( \frac{e \, \varphi(\mathbf{r})}{k T} \right) = -\frac{\rho(\mathbf{r})}{\varepsilon_0}$$

Scoring Functions for Protein Docking

Continuum Electrostatics

[Figure: contributions $\Delta\Delta G^{A}_{sol}$, $\Delta\Delta G^{B}_{sol}$, and $\Delta G^{A-B}_{int}$]

Scoring Functions for Protein Docking

Energetic Evaluation

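Jackson, Sternberg (1994):

$$\Delta G_{bind} = \Delta G_{ele} + \Delta G_{cav}$$

$$\Delta G_{ele} = \Delta\Delta G^{A}_{sol} + \Delta\Delta G^{B}_{sol} + \Delta G^{A-B}_{int}$$

$$\Delta G_{cav} = \gamma \, \Delta A = \gamma \left( A^{AB}_{MS} - A^{A}_{MS} - A^{B}_{MS} \right)$$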

From Rigid to Flexible Docking

Docking of Unbound Structures

• All previous examples are “easy”
– Structures A and B derived from true complex structure (bound structures)
– Geometric complementarity guaranteed!
• More interesting problem:
– Use unbound structures
– Problem: few test sets, since A, B, and AB have to be known!

Protein Docking: Introduction

Why Docking?

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Decomposition of A/B into Disks

• Calculate protein surface
• Decompose into disks along z-axis
• Decompose disk into segments of equal area
• Differential surface: (r2 – r1), (r3 – r2), (r4 – r3), (r5 – r4), (r6 – r5), …, (r1 – rn)

Fig. after Helmer-Citterich, Tramontano, JMB 1994

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Matching of Disc Segments

• Try to match windows, i.e. stretches of successive radii in a disk, from A and B

• Matching windows define tentative transformations

• Aggregate overlapping windows to regions

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Horizontal Windows Growing

[Figure: growing windows on A and B]

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Overall Algorithm

• Determine surface of A and B
• Decompose A into disks along z-axis
• For all rotations of B:
– Decompose B into disks
– Find matching windows on differential surface
– Grow windows horizontally
– Grow windows vertically (→ regions)
– Each region defines a transformation
