Regression Models for Drug Discovery Prof. Bennett Math Models of Data Science 1/30/06

Feb 11, 2022


Page 1: Regression Models for Drug Discovery

Regression Models for Drug Discovery

Prof. Bennett
Math Models of Data Science

1/30/06

Page 2: Regression Models for Drug Discovery

Announcement

Please bring laptops with MATLAB installed to future classes. If you need help with MATLAB, go to the help desk.

Do the MATLAB tutorial if you don’t have MATLAB.

Page 3: Regression Models for Drug Discovery

Outline

• Review
• Least Squares (LS) Model
• Unconstrained Optimization
• Optimality Condition for LS
• Ridge Regression Models
• Model with Bias
• Possible Improvements

Page 4: Regression Models for Drug Discovery

The figure depicts a cartoon representation of the relationship between the continuum of chemical space (light blue) and the discrete areas of chemical space that are occupied by compounds with specific affinity for biological molecules. Examples of such molecules are those from major gene families (shown in brown, with specific gene families colour-coded as proteases (purple), lipophilic GPCRs (blue) and kinases (red)). The independent intersection of compounds with drug-like properties, that is those in a region of chemical space defined by the possession of absorption, distribution, metabolism and excretion properties consistent with orally administered drugs —ADME space — is shown in green.

Christopher Lipinski & Andrew Hopkins, Nature, Vol. 432, 16 December 2004, pp. 855-861

Page 5: Regression Models for Drug Discovery

Descriptors from Molecular Electronic Properties

[Figure: structural diagram of an example molecule, with atom labels O, N, and CH3]

Page 6: Regression Models for Drug Discovery

MOE Descriptors® Chemical Computing Group Inc.

“2-D” Molecular Descriptors can be calculated from the connection table (with no dependence on conformation):

• Physical Properties
• Subdivided Surface Area Descriptors
• Atom Counts and Bond Counts
• Connectivity and Shape Indices
• Adjacency and Distance Matrix Descriptors
• Pharmacophore Feature Descriptors
• Partial Charge Descriptors

“3-D” Descriptors depend on molecular coordinates:

• Potential Energy Descriptors
• Surface Area, Volume and Shape Descriptors
• Conformation Dependent Charge Descriptors

• Sum of the atomic polarizabilities
• Molecular mass density
• Total charge of the molecule
• Molecular refractivity
• Molecular weight
• Log of the octanol/water partition coefficient

• Number of aromatic atoms
• Number of atoms
• Number of heavy atoms
• Number of hydrogen atoms
• Number of boron atoms
• Number of carbon atoms
• Number of nitrogen atoms
• Number of oxygen atoms
• Number of fluorine atoms
• Number of phosphorus atoms
• Number of sulfur atoms
• Number of chlorine atoms
• Number of bromine atoms
• Number of iodine atoms
• Number of rotatable single bonds
• Number of aromatic bonds
• Number of bonds
• Number of double bonds
• Number of rotatable bonds
• Fraction of rotatable bonds
• Number of single bonds
• Number of triple bonds
• Number of chiral centers
• Number of O and N atoms
• Number of OH and NH groups
• Number of rings

• Water accessible surface area of all atoms with positive partial charge
• Water accessible surface area of all atoms with negative partial charge
• Water accessible surface area of all hydrophobic atoms
• Water accessible surface area of all polar atoms
• Positive charge weighted surface area
• Negative charge weighted surface area

• Water accessible surface area
• Globularity
• Principal moment of inertia
• Radius of gyration
• van der Waals surface area

• Angle bend potential energy
• Electrostatic component of the potential energy
• Out-of-plane potential energy
• Solvation energy
• Bond stretch potential energy
• Local strain energy
• Torsion potential energy

• Number of hydrogen bond acceptor atoms
• Number of acidic atoms
• Number of basic atoms
• Number of hydrogen bond donor atoms
• Number of hydrophobic atoms

• Total positive partial charge
• Total negative partial charge
• Total positive van der Waals surface area
• Total negative van der Waals surface area
• Fractional positive polar van der Waals surface area
• Fractional negative polar van der Waals surface area

Page 7: Regression Models for Drug Discovery

Predict Drug Bioavailability

Aqueous solubility = Aquasol

525 descriptors generated
• Electronic TAE
• Traditional

197 molecules with tested solubility

$y_i \in R$, $\mathbf{x}_i \in R^{525}$, $\ell = 197$

Page 8: Regression Models for Drug Discovery

Linear Regression

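Given training data:

$$S = \big( (\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_i, y_i), \ldots, (\mathbf{x}_\ell, y_\ell) \big)$$

with points $\mathbf{x}_i \in R^n$ and labels $y_i \in R$.

Construct linear function:

$$g(\mathbf{x}) = \langle \mathbf{w}, \mathbf{x} \rangle = \mathbf{x}'\mathbf{w} = \sum_{i=1}^{n} w_i x_i$$

Goal for future data (x, y) with y unknown:

$$g(\mathbf{x}) \approx y$$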

Page 9: Regression Models for Drug Discovery

1-d Regression

[Figure: plot of y against x with the fitted line $\langle \mathbf{w}, \mathbf{x} \rangle$]

Page 10: Regression Models for Drug Discovery

Least Squares Approximation

Want

$$g(\mathbf{x}) \approx y$$

Define error

$$f(\mathbf{x}, y) = y - g(\mathbf{x}) = \xi$$

Minimize loss

$$L(g, S) = \sum_{i=1}^{\ell} \big( y_i - g(\mathbf{x}_i) \big)^2$$
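The least squares loss can be evaluated directly from its definition. The following is a minimal sketch, assuming a made-up 1-d data set; the helper names `g` and `ls_loss` and all numbers are illustrative and do not come from the lecture.

```python
def g(x, w):
    """Linear function g(x) = <w, x> for a feature list x and weight list w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def ls_loss(w, points, labels):
    """Least squares loss: sum over the data of (y_i - g(x_i))^2."""
    return sum((y - g(x, w)) ** 2 for x, y in zip(points, labels))


# Toy 1-d data (illustrative values only), roughly following y = 2x.
points = [[1.0], [2.0], [3.0]]
labels = [2.0, 4.0, 7.0]

loss_w2 = ls_loss([2.0], points, labels)  # errors are 0, 0, 1
loss_w1 = ls_loss([1.0], points, labels)  # errors are 1, 2, 4
print(loss_w2, loss_w1)
```

The weight with the smaller loss is the better fit to this training data, which is all that "training" means in the slides that follow.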

Page 11: Regression Models for Drug Discovery

“Training” Model

Create function g(x)=x’w that works well on training data as defined by some loss function

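$$\min_{\mathbf{w}} \sum_{i=1}^{\ell} \mathrm{loss}(\mathbf{x}_i'\mathbf{w}, y_i)$$

For least squares loss this becomes

$$\min_{\mathbf{w}} \sum_{i=1}^{\ell} (\mathbf{x}_i'\mathbf{w} - y_i)^2$$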

Page 12: Regression Models for Drug Discovery

Linear Algebra

Let data matrix X have a data point for each row

$$\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_\ell]'$$

Response is vector y

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_\ell \end{bmatrix}$$

Page 13: Regression Models for Drug Discovery

Equivalent Forms of Loss

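$$\begin{aligned}
L(\mathbf{w}) &= \sum_{i=1}^{\ell} (\mathbf{x}_i'\mathbf{w} - y_i)^2 \\
&= \|\mathbf{X}\mathbf{w} - \mathbf{y}\|^2 \quad \text{(2-norm)} \\
&= (\mathbf{X}\mathbf{w} - \mathbf{y})'(\mathbf{X}\mathbf{w} - \mathbf{y}) \\
&= \mathbf{y}'\mathbf{y} - 2\mathbf{y}'\mathbf{X}\mathbf{w} + \mathbf{w}'\mathbf{X}'\mathbf{X}\mathbf{w}
\end{aligned}$$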
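The equivalence of the sum form, the norm form, and the expanded quadratic form can be checked numerically. This is a small sketch under assumed toy values for X, y, and w; the helpers `matvec` and `dot` are hypothetical names introduced only for this illustration.

```python
def dot(a, b):
    """Inner product a'b of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(X, w):
    """Matrix-vector product Xw, with X stored as a list of rows."""
    return [dot(row, w) for row in X]


# Illustrative data: 3 points with 2 features each.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 0.0, 2.0]
w = [0.5, -1.0]

Xw = matvec(X, w)
residual = [p - t for p, t in zip(Xw, y)]

form_sum = sum((dot(row, w) - t) ** 2 for row, t in zip(X, y))
form_norm = dot(residual, residual)                        # (Xw - y)'(Xw - y)
form_expanded = dot(y, y) - 2 * dot(y, Xw) + dot(Xw, Xw)   # w'X'Xw = (Xw)'(Xw)
print(form_sum, form_norm, form_expanded)
```

All three values agree, which is why the optimality condition can be derived from whichever form is most convenient.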

Page 14: Regression Models for Drug Discovery

Just in time lesson in convex optimization

Convexity and Unconstrained Optimization

Page 15: Regression Models for Drug Discovery

WARNING

In Machine Learning Land:
• x is the data
• we minimize loss with respect to the variable w

In optimization:
• x is a variable
• we minimize objective with respect to x

Page 16: Regression Models for Drug Discovery

If you don’t know where you are going, you probably won’t get there.

- from some book I read in eighth grade

If you do get there, you won’t know it. - Dr. Bennett’s amendment

Mathematical Programming Theory tells us:
• How to formulate a model.
• Strategies for solving the model.
• How to know when we have found an optimal solution.
• How hard it is to solve the model.

Let’s start with the basics…

Page 17: Regression Models for Drug Discovery

Line Segment

Let $x \in R^n$ and $y \in R^n$. The points on the line segment joining x and y are $\{ z \mid z = \lambda x + (1-\lambda) y,\ 0 \le \lambda \le 1 \}$.

[Figure: line segment joining the points x and y]

Page 18: Regression Models for Drug Discovery

Convex Functions

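A function f is convex on a convex set S if and only if for any $x, y \in S$,

$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } 0 \le \lambda \le 1.$$

f is strictly convex if the inequality is strict (<) whenever $x \ne y$ and $0 < \lambda < 1$.

[Figure: graph of a convex function; at the point $\lambda x + (1-\lambda) y$ the value $f(\lambda x + (1-\lambda) y)$ lies below the chord joining (x, f(x)) and (y, f(y))]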
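The defining inequality can be spot-checked numerically for a 1-d function. The sketch below samples values of λ between 0 and 1 for a single pair of points; the function name `violates_convexity` and the test functions are assumptions made for illustration. A sampled check of this kind can only refute convexity, never prove it.

```python
def violates_convexity(f, x, y, steps=11, tol=1e-12):
    """Return True if f(lam*x + (1-lam)*y) exceeds lam*f(x) + (1-lam)*f(y)
    for some sampled lam in [0, 1]."""
    for k in range(steps):
        lam = k / (steps - 1)
        z = lam * x + (1 - lam) * y
        if f(z) > lam * f(x) + (1 - lam) * f(y) + tol:
            return True
    return False


def square(t):
    """A convex function."""
    return t * t


def neg_square(t):
    """A concave (hence non-convex) function."""
    return -t * t


convex_ok = not violates_convexity(square, -1.0, 3.0)
concave_fails = violates_convexity(neg_square, -1.0, 3.0)
print(convex_ok, concave_fails)
```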

Page 19: Regression Models for Drug Discovery

Concave Functions

A function f is (strictly) concave on a convex set S if and only if –f is (strictly) convex on S.

[Figure: a concave function f and its negation –f]

Page 20: Regression Models for Drug Discovery

(Strictly)Convex, Concave, or none of the above?

[Figure: five example curves, labeled in turn: none of the above; concave; convex; concave; strictly convex]

Page 21: Regression Models for Drug Discovery

Favorite Convex Functions

Linear functions

$$f(x) = w'x = \sum_{i=1}^{n} w_i x_i \quad \text{where } x \in R^n$$

e.g. $f(x_1, x_2) = 2x_1 + x_2$

Certain quadratic functions, depending on the choice of Q (which determines the Hessian matrix)

$$f(x) = x'Qx + w'x + c$$

e.g. $f(x_1, x_2) = 2x_1^2 + x_2^2$

Page 22: Regression Models for Drug Discovery

Convexity of function affects optimization algorithm

Page 23: Regression Models for Drug Discovery

Theorem 2.1: Global Solution of convex program

If x* is a local minimizer of a convex programming problem, x* is also a global minimizer. Furthermore, if the objective is strictly convex, then x* is the unique global minimizer.

Proof: see Nash and Sofer.

Page 24: Regression Models for Drug Discovery

Problems with nonconvex objective

Min f(x) subject to x ∈ [a,b]

[Figure, first panel: f strictly convex on [a,b]; the problem has a unique global minimum x*]

[Figure, second panel: f not convex on [a,b]; the problem has two local minima, x’ and x*]

Page 25: Regression Models for Drug Discovery

Multivariate Calculus

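For $x \in R^n$, $f(x) = f(x_1, x_2, x_3, x_4, \ldots, x_n)$.

The gradient of f:

$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right)'$$

The Hessian of f:

$$\nabla^2 f(x) = \begin{bmatrix}
\dfrac{\partial^2 f(x)}{\partial x_1 \partial x_1} & \dfrac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f(x)}{\partial x_2 \partial x_2} & \cdots & \vdots \\
\vdots & & \ddots & \vdots \\
\dfrac{\partial^2 f(x)}{\partial x_n \partial x_1} & \dfrac{\partial^2 f(x)}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_n \partial x_n}
\end{bmatrix}$$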

Page 26: Regression Models for Drug Discovery

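For example

$$f(x) = x_1^2 + e^{3x_1} + 3x_2^4 + 4x_1 x_2$$

$$\nabla f(x) = \begin{bmatrix} 2x_1 + 3e^{3x_1} + 4x_2 \\ 12x_2^3 + 4x_1 \end{bmatrix}
\qquad
\nabla^2 f(x) = \begin{bmatrix} 2 + 9e^{3x_1} & 4 \\ 4 & 36x_2^2 \end{bmatrix}$$

At $x = [0, 1]'$:

$$\nabla f(x) = \begin{bmatrix} 7 \\ 12 \end{bmatrix}
\qquad
\nabla^2 f(x) = \begin{bmatrix} 11 & 4 \\ 4 & 36 \end{bmatrix}$$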
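A hand-computed gradient can be sanity-checked against central finite differences. The sketch below assumes the example function reads f(x) = x1^2 + exp(3 x1) + 3 x2^4 + 4 x1 x2, which is the reading consistent with the values 7, 12 and 11, 4, 4, 36 at x = [0, 1]'; the helper `numeric_grad` and the step size are illustrative choices.

```python
import math


def f(x1, x2):
    """Assumed example function: x1^2 + exp(3*x1) + 3*x2^4 + 4*x1*x2."""
    return x1 ** 2 + math.exp(3 * x1) + 3 * x2 ** 4 + 4 * x1 * x2


def grad(x1, x2):
    """Analytic gradient of f."""
    return [2 * x1 + 3 * math.exp(3 * x1) + 4 * x2, 12 * x2 ** 3 + 4 * x1]


def hessian(x1, x2):
    """Analytic Hessian of f."""
    return [[2 + 9 * math.exp(3 * x1), 4.0], [4.0, 36 * x2 ** 2]]


def numeric_grad(x1, x2, h=1e-6):
    """Central-difference approximation of the gradient."""
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return [d1, d2]


g_exact = grad(0.0, 1.0)
H_exact = hessian(0.0, 1.0)
g_num = numeric_grad(0.0, 1.0)
print(g_exact, H_exact, g_num)
```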

Page 27: Regression Models for Drug Discovery

Quadratic Functions

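Form

$$f(x) = \frac{1}{2} x'Qx - b'x = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} Q_{ij} x_i x_j - \sum_{j=1}^{n} b_j x_j, \qquad x \in R^n,\ Q \in R^{n \times n},\ b \in R^n$$

Gradient

$$\frac{\partial f(x)}{\partial x_k} = Q_{kk} x_k + \frac{1}{2} \sum_{i \ne k} Q_{ik} x_i + \frac{1}{2} \sum_{j \ne k} Q_{kj} x_j - b_k = \sum_{j=1}^{n} Q_{kj} x_j - b_k \quad \text{assuming } Q \text{ symmetric}$$

$$\nabla f(x) = Qx - b \qquad \nabla^2 f(x) = Q$$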

Page 28: Regression Models for Drug Discovery

Taylor Series Expansion about x* - 1D Case

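Let x = x* + p

$$f(x) = f(x^* + p) = f(x^*) + p f'(x^*) + \frac{1}{2} p^2 f''(x^*) + \frac{1}{3!} p^3 f'''(x^*) + \cdots + \frac{1}{n!} p^n f^{(n)}(x^*) + \cdots$$

Equivalently

$$f(x) = f(x^*) + (x - x^*) f'(x^*) + \frac{1}{2} (x - x^*)^2 f''(x^*) + \frac{1}{3!} (x - x^*)^3 f'''(x^*) + \cdots + \frac{1}{n!} (x - x^*)^n f^{(n)}(x^*) + \cdots$$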

Page 29: Regression Models for Drug Discovery

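Taylor Series Example

Let f(x) = exp(-x); compute the Taylor series expansion about x* = 0.

$$\begin{aligned}
f(x) &= f(x^*) + (x - x^*) f'(x^*) + \frac{1}{2} (x - x^*)^2 f''(x^*) + \frac{1}{3!} (x - x^*)^3 f'''(x^*) + \cdots + \frac{1}{n!} (x - x^*)^n f^{(n)}(x^*) + \cdots \\
&= e^{-x^*} - x e^{-x^*} + \frac{x^2}{2} e^{-x^*} - \frac{x^3}{3!} e^{-x^*} + \cdots + (-1)^n \frac{x^n}{n!} e^{-x^*} + \cdots \\
&= 1 - x + \frac{x^2}{2} - \frac{x^3}{3!} + \cdots + (-1)^n \frac{x^n}{n!} + \cdots
\end{aligned}$$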
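The convergence of this expansion is easy to observe by summing a few terms. A minimal sketch, assuming the evaluation point x = 0.5 and the hypothetical helper name `taylor_exp_neg`:

```python
import math


def taylor_exp_neg(x, n_terms):
    """Partial sum of sum_{n>=0} (-1)^n x^n / n!, the expansion of exp(-x) about 0."""
    return sum((-1) ** n * x ** n / math.factorial(n) for n in range(n_terms))


x = 0.5
for n_terms in (2, 4, 8):
    approx = taylor_exp_neg(x, n_terms)
    # Print the number of terms, the partial sum, and the absolute error.
    print(n_terms, approx, abs(approx - math.exp(-x)))
```

With only two terms the sum is the first order (linear) approximation 1 - x; each additional term shrinks the error near x* = 0.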

Page 30: Regression Models for Drug Discovery

First Order Taylor Series Approximation

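Let x = x* + p

$$f(x) = f(x^* + p) = f(x^*) + p'\nabla f(x^*) + \|p\|\, \alpha(x^*, p) \quad \text{where } \lim_{p \to 0} \alpha(x^*, p) = 0$$

Says that a linear approximation of a function works well locally:

$$f(x) = f(x^* + p) \approx f(x^*) + p'\nabla f(x^*)$$

$$f(x) \approx f(x^*) + (x - x^*)'\nabla f(x^*)$$

[Figure: graph of f(x) with its linear approximation at x*]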

Page 31: Regression Models for Drug Discovery

Exercise

Function: $f(x_1, x_2) = x_1^3 + 5x_1^2 x_2 + 7x_1 x_2^2 + 2x_2^2$

Gradient: $\nabla f(x) = [\ \ ,\ \ ]'$, $\nabla f(x^*) = [\ \ ,\ \ ]'$

Hessian: $\nabla^2 f(x) = \begin{bmatrix} \ & \ \\ \ & \ \end{bmatrix}$, $\nabla^2 f(x^*) = \begin{bmatrix} \ & \ \\ \ & \ \end{bmatrix}$

Page 32: Regression Models for Drug Discovery

Exercise 2

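Function: $f(x_1, x_2) = x_1^3 + 5x_1^2 x_2 + 7x_1 x_2^2 + 2x_2^2$, $f(x^*) = -56$

Gradient:

$$\nabla f(x) = \begin{bmatrix} 3x_1^2 + 10x_1 x_2 + 7x_2^2 \\ 5x_1^2 + 14x_1 x_2 + 4x_2 \end{bmatrix}, \qquad \nabla f(x^*) = [15, -52]'$$

Hessian:

$$\nabla^2 f(x) = \begin{bmatrix} 6x_1 + 10x_2 & 10x_1 + 14x_2 \\ 10x_1 + 14x_2 & 14x_1 + 4 \end{bmatrix}, \qquad \nabla^2 f(x^*) = \begin{bmatrix} 18 & 22 \\ 22 & -24 \end{bmatrix}$$

These values correspond to the point $x^* = [-2, 3]'$.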

Page 33: Regression Models for Drug Discovery

Convex Functions

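A function f is convex on a convex set S if and only if for any $x, y \in S$,

$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } 0 \le \lambda \le 1.$$

f is strictly convex if the inequality is strict (<) whenever $x \ne y$ and $0 < \lambda < 1$.

[Figure: graph of a convex function; at the point $\lambda x + (1-\lambda) y$ the value $f(\lambda x + (1-\lambda) y)$ lies below the chord joining (x, f(x)) and (y, f(y))]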

Page 34: Regression Models for Drug Discovery

Proving Function Convex

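Linear functions

$$f(x) = w'x = \sum_{i=1}^{n} w_i x_i \quad \text{where } x \in R^n$$

For any $x, y \in R^n$ and $\lambda \in (0, 1)$:

$$\begin{aligned}
f(\lambda x + (1-\lambda) y) &= w'(\lambda x + (1-\lambda) y) \\
&= \lambda w'x + (1-\lambda) w'y \\
&\le \lambda f(x) + (1-\lambda) f(y)
\end{aligned}$$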

Page 35: Regression Models for Drug Discovery

Is least squares function convex?

Page 36: Regression Models for Drug Discovery

Theorem

Let f be twice continuously differentiable. f(x) is convex on S if and only if for all x ∈ S, the Hessian at x, $\nabla^2 f(x)$, is positive semi-definite.

Page 37: Regression Models for Drug Discovery

Definition

The matrix H is positive semi-definite (p.s.d.) if and only if for any vector y

$$y'Hy \ge 0$$

The matrix H is positive definite (p.d.) if and only if for any nonzero vector y

$$y'Hy > 0$$

Similarly for negative (semi-) definite.

Page 38: Regression Models for Drug Discovery

Theorem

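Let f be twice continuously differentiable. f(x) is strictly convex on S if for all x ∈ S, the Hessian at x, $\nabla^2 f(x)$, is positive definite.

The converse does not hold: $f(x) = x^4$ is strictly convex, yet its second derivative is 0 at x = 0.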

Page 39: Regression Models for Drug Discovery

Checking Matrix H is p.s.d/p.d.

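Manually

$$\begin{aligned}
\begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} 4 & -1 \\ -1 & 3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
&= 4x_1^2 - x_1 x_2 - x_1 x_2 + 3x_2^2 \\
&= 4x_1^2 - 2x_1 x_2 + 3x_2^2 \\
&= (x_1 - x_2)^2 + 3x_1^2 + 2x_2^2 > 0 \quad \forall\, [x_1, x_2] \ne 0
\end{aligned}$$

so the matrix is positive definite.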
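The manual argument can be cross-checked in code. The sketch below evaluates the quadratic form at one sample vector and, as a swapped-in alternative to the completing-the-square argument, applies Sylvester's criterion (both leading principal minors positive) for a symmetric 2×2 matrix. The helper names and the sample vector are assumptions for illustration.

```python
def quad_form(H, v):
    """Return v'Hv for a 2x2 matrix H and a length-2 vector v."""
    return sum(v[i] * H[i][j] * v[j] for i in range(2) for j in range(2))


def is_pd_2x2(H):
    """Sylvester's criterion for a symmetric 2x2 matrix:
    H is positive definite iff H[0][0] > 0 and det(H) > 0."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return H[0][0] > 0 and det > 0


H = [[4.0, -1.0], [-1.0, 3.0]]
v = [1.0, 2.0]

direct = quad_form(H, v)
# Expansion from the manual check: (x1 - x2)^2 + 3*x1^2 + 2*x2^2
manual = (v[0] - v[1]) ** 2 + 3 * v[0] ** 2 + 2 * v[1] ** 2
print(direct, manual, is_pd_2x2(H))
```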

Page 40: Regression Models for Drug Discovery

Differentiability and Convexity

For a convex function, the linear approximation underestimates the function:

$$g(x) = f(x^*) + (x - x^*)'\nabla f(x^*)$$

[Figure: convex f(x) lying above its tangent line g(x), which touches it at (x*, f(x*))]

Page 41: Regression Models for Drug Discovery

Theorem

Assume f is continuously differentiable on a set S.

f is convex on S if and only if

$$f(y) \ge f(x) + (y - x)'\nabla f(x) \quad \forall\, x, y \in S$$

Page 42: Regression Models for Drug Discovery

Theorem

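Consider the unconstrained problem min f(x). If $\nabla f(\bar{x}) = 0$ and f is convex, then $\bar{x}$ is a global minimum.

Proof: for all y,

$$\begin{aligned}
f(y) &\ge f(\bar{x}) + (y - \bar{x})'\nabla f(\bar{x}) && \text{by convexity of } f \\
&= f(\bar{x}) && \text{since } \nabla f(\bar{x}) = 0.
\end{aligned}$$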

Page 43: Regression Models for Drug Discovery

Unconstrained Optimality Conditions

Basic Problem:

$$\min_{x \in S} f(x) \qquad (1)$$

where S is an open set, e.g. $R^n$

Page 44: Regression Models for Drug Discovery

First Order Necessary Conditions

Theorem: Let f be continuously differentiable.

If x* is a local minimizer of (1), then

$$\nabla f(x^*) = 0$$

Page 45: Regression Models for Drug Discovery

Stationary Points

Note that this condition is not sufficient:

$$\nabla f(x^*) = 0$$

is also true for local max and saddle points.

Page 46: Regression Models for Drug Discovery

Proof

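Assume false, i.e., $\nabla f(x^*) \ne 0$.

Let $d = -\nabla f(x^*)$, then

$$f(x^* + \lambda d) = f(x^*) + \lambda d'\nabla f(x^*) + \lambda \|d\|\, \alpha(x^*, \lambda d)$$

$$\Downarrow$$

$$\frac{f(x^* + \lambda d) - f(x^*)}{\lambda} = d'\nabla f(x^*) + \|d\|\, \alpha(x^*, \lambda d)$$

$$\Downarrow$$

$$f(x^* + \lambda d) - f(x^*) < 0 \quad \text{for sufficiently small } \lambda > 0,$$

since $d'\nabla f(x^*) < 0$ and $\alpha(x^*, \lambda d) \to 0$.

CONTRADICTION!! x* is a local min.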

Page 47: Regression Models for Drug Discovery

Example

Say we are minimizing

$$f(x_1, x_2) = x_1^2 - \frac{1}{2} x_1 x_2 + 2x_2^2 - 15x_1 - 4x_2$$

[8, 2]???

Page 48: Regression Models for Drug Discovery

Solve FONC

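Solve FONC to find stationary point x*

$$\nabla f(x_1, x_2) = \begin{bmatrix} 2 & -\frac{1}{2} \\ -\frac{1}{2} & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} 15 \\ 4 \end{bmatrix} = 0$$

$$\Rightarrow \begin{bmatrix} x_1^* \\ x_2^* \end{bmatrix} = \begin{bmatrix} 2 & -\frac{1}{2} \\ -\frac{1}{2} & 4 \end{bmatrix}^{-1} \begin{bmatrix} 15 \\ 4 \end{bmatrix} = \begin{bmatrix} 8 \\ 2 \end{bmatrix}$$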

Page 49: Regression Models for Drug Discovery

Argument

The Hessian at every value x is

$$\nabla^2 f(x_1, x_2) = \begin{bmatrix} 2 & -\frac{1}{2} \\ -\frac{1}{2} & 4 \end{bmatrix}$$

which is p.d. Therefore the function is strictly convex. Since $\nabla f(x^*) = 0$ and f is strictly convex, x* is the unique strict global minimum.
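The stationary point of this example can be reproduced in a few lines. The following sketch assumes the objective f(x1, x2) = x1^2 - (1/2) x1 x2 + 2 x2^2 - 15 x1 - 4 x2 and solves the 2×2 first order condition with Cramer's rule; `solve_2x2` is a hypothetical helper, not something from the lecture.

```python
def solve_2x2(A, b):
    """Solve A x = b for a 2x2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    x1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x2 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return [x1, x2]


def f(x1, x2):
    """Assumed objective from the example."""
    return x1 ** 2 - 0.5 * x1 * x2 + 2 * x2 ** 2 - 15 * x1 - 4 * x2


Q = [[2.0, -0.5], [-0.5, 4.0]]   # Hessian of f
b = [15.0, 4.0]                  # gradient is Q x - b

x_star = solve_2x2(Q, b)
f_star = f(*x_star)
# Nearby points should have a larger objective value if x_star is a minimum.
neighbors = [f(x_star[0] + dx, x_star[1] + dy)
             for dx, dy in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]]
print(x_star, f_star, min(neighbors))
```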

Page 50: Regression Models for Drug Discovery

End of Convex Programming Mini-lesson

Important points

For unconstrained minimization:
• Want to minimize a convex function
• Necessary and sufficient condition for optimality of a convex function is

$$\nabla f(x^*) = 0$$

Page 51: Regression Models for Drug Discovery

Least Squares Optimality condition

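Find min with respect to w of

$$\min_{\mathbf{w}} \sum_{i=1}^{\ell} (\mathbf{x}_i'\mathbf{w} - y_i)^2$$

or, equivalently,

$$\min_{\mathbf{w}} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|^2 \quad \text{(2-norm)}$$

$$\min_{\mathbf{w}} (\mathbf{X}\mathbf{w} - \mathbf{y})'(\mathbf{X}\mathbf{w} - \mathbf{y})$$

$$\min_{\mathbf{w}} \mathbf{y}'\mathbf{y} - 2\mathbf{y}'\mathbf{X}\mathbf{w} + \mathbf{w}'\mathbf{X}'\mathbf{X}\mathbf{w}$$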

Page 52: Regression Models for Drug Discovery

Optimal Solution

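Want:

$$\mathbf{y} \approx \mathbf{X}\mathbf{w}$$

Mathematical Model:

$$\min_{\mathbf{w}} L(\mathbf{w}, S) = \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2 = (\mathbf{y} - \mathbf{X}\mathbf{w})'(\mathbf{y} - \mathbf{X}\mathbf{w})$$

Optimality Condition:

$$\frac{\partial L(\mathbf{w}, S)}{\partial \mathbf{w}} = -2\mathbf{X}'\mathbf{y} + 2\mathbf{X}'\mathbf{X}\mathbf{w} = 0$$

Solution satisfies:

$$\mathbf{X}'\mathbf{X}\mathbf{w} = \mathbf{X}'\mathbf{y}$$

Solving the n×n system of equations is $O(n^3)$.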

Page 53: Regression Models for Drug Discovery

Solution

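Assume $(\mathbf{X}'\mathbf{X})^{-1}$ exists, then

$$\mathbf{X}'\mathbf{X}\mathbf{w} = \mathbf{X}'\mathbf{y} \;\Rightarrow\; \mathbf{w} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

Is this a good assumption?

Try it on Caco2!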
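The normal equations X'Xw = X'y can be solved without forming the inverse explicitly. The sketch below uses Gaussian elimination with partial pivoting on a tiny synthetic data set (not Caco2); all helper names and data values are assumptions for illustration. When X'X is singular, the solver raises an error, which is exactly the case the question above is pointing at.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares(X, y):
    """Return w solving the normal equations X'X w = X'y."""
    Xt = transpose(X)
    XtX = [[sum(a * b for a, b in zip(ri, rj)) for rj in Xt] for ri in Xt]
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Synthetic data generated exactly from w = [3, -2].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, -2.0, 1.0, 4.0]
w = least_squares(X, y)
print(w)
```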

Page 54: Regression Models for Drug Discovery

Ridge Regression

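The inverse typically does not exist. Use the least norm solution for fixed $\lambda > 0$.

Regularized problem:

$$\min_{\mathbf{w}} L_\lambda(\mathbf{w}, S) = \lambda \|\mathbf{w}\|^2 + \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2$$

Optimality Condition:

$$\frac{\partial L_\lambda(\mathbf{w}, S)}{\partial \mathbf{w}} = 2\lambda \mathbf{w} - 2\mathbf{X}'\mathbf{y} + 2\mathbf{X}'\mathbf{X}\mathbf{w} = 0$$

$$(\mathbf{X}'\mathbf{X} + \lambda \mathbf{I}_n)\mathbf{w} = \mathbf{X}'\mathbf{y}$$

Requires $O(n^3)$ operations.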
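Adding λ to the diagonal of X'X makes the system solvable even when X'X itself is singular. The sketch below illustrates this with two identical feature columns, a case where plain least squares has no unique solution; the data, the value of λ, and the helper names are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("system is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge(X, y, lam):
    """Return w solving (X'X + lam*I) w = X'y."""
    cols = [list(c) for c in zip(*X)]  # columns of X, i.e. rows of X'
    n = len(cols)
    A = [[sum(a * b for a, b in zip(cols[i], cols[j])) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(a * b for a, b in zip(col, y)) for col in cols]
    return solve(A, rhs)


# Two identical columns: X'X = [[5, 5], [5, 5]] is singular.
X = [[1.0, 1.0], [2.0, 2.0]]
y = [2.0, 4.0]

w = ridge(X, y, lam=0.1)
print(w)  # the weight is shared equally between the two identical features
```

Among all weight vectors that fit this data equally well, the regularizer favors the one with small norm, which is why the two weights come out equal.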

Page 55: Regression Models for Drug Discovery

Generalization

To estimate generalization error:
• Divide the data into a training set, Xtrain (100 points in Aquasol), and a test set, Xtest (97 points in Aquasol)
• Create g(x) using Xtrain
• Evaluate on Xtest
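The train/test procedure can be sketched on a 1-d problem, where the ridge solution reduces to a single division. The data below is a synthetic stand-in (it is not Aquasol), and the function names and the 6/4 split are assumptions chosen only to keep the example small.

```python
def fit_1d(xs, ys, lam=0.0):
    """1-d ridge regression without bias: w = sum(x*y) / (sum(x*x) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def sq_error(w, xs, ys):
    """Sum of squared errors of the model g(x) = w*x on the given points."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys))


# Synthetic stand-in data: y = 2x with an alternating +/- 0.5 perturbation.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

# Hold out the last 4 points as the test set.
x_train, y_train = xs[:6], ys[:6]
x_test, y_test = xs[6:], ys[6:]

w = fit_1d(x_train, y_train)            # create g(x) using the training set only
train_err = sq_error(w, x_train, y_train)
test_err = sq_error(w, x_test, y_test)  # evaluate on points the model never saw
print(w, train_err, test_err)
```

Repeating the fit for several values of `lam` and comparing `test_err` is one simple way to choose λ.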

Page 56: Regression Models for Drug Discovery

Matlab

Matlab command:

Page 57: Regression Models for Drug Discovery

Train and Test for λ

[Figure: two plots with horizontal axis x ranging from 0.5 to 5. First plot: trainerr(x), vertical axis from 0 to 16. Second plot: testerr(x), vertical axis from 1.5 to 1.95 (×10^4).]

Page 58: Regression Models for Drug Discovery

1-d Regression with bias

[Figure: plot of y against x with the fitted line $\langle \mathbf{w}, \mathbf{x} \rangle + b$, where b = 2]

Page 59: Regression Models for Drug Discovery

Optimal Solution

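Want:

$$\mathbf{y} \approx \mathbf{X}\mathbf{w} + b\mathbf{e}, \quad \mathbf{e} \text{ is a vector of ones}$$

Mathematical Model:

$$\min_{\mathbf{w}, b} L(\mathbf{w}, b, S) = \|\mathbf{y} - (\mathbf{X}\mathbf{w} + b\mathbf{e})\|^2$$

Optimality Conditions:

$$\frac{\partial L(\mathbf{w}, b, S)}{\partial \mathbf{w}} = -2\mathbf{X}'(\mathbf{y} - \mathbf{X}\mathbf{w} - b\mathbf{e}) = 0$$

$$\frac{\partial L(\mathbf{w}, b, S)}{\partial b} = -2\mathbf{e}'(\mathbf{y} - \mathbf{X}\mathbf{w} - b\mathbf{e}) = 0$$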

Page 60: Regression Models for Drug Discovery

Optimal Solution

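Thus:

$$b\,\mathbf{e}'\mathbf{e} = \mathbf{e}'\mathbf{y} - \mathbf{e}'\mathbf{X}\mathbf{w} \;\Rightarrow\; b = \frac{\mathbf{e}'\mathbf{y} - \mathbf{e}'\mathbf{X}\mathbf{w}}{\ell} = \mathrm{mean}(\mathbf{y}) - \mathrm{mean}(\mathbf{X})'\mathbf{w}$$

$$\mathbf{X}'\mathbf{X}\mathbf{w} = \mathbf{X}'\mathbf{y} - b\,\mathbf{X}'\mathbf{e}$$

Here $\mathbf{e}'\mathbf{e} = \ell$ and $\mathrm{mean}(\mathbf{X}) = \mathbf{X}'\mathbf{e}/\ell$ is the vector of column means.

Idea: Scale data so means are 0, e.g.

$$\mathbf{e}'\mathbf{y} = 0, \qquad \mathbf{e}'\mathbf{X} = 0$$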

Page 61: Regression Models for Drug Discovery

Recenter Data

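Shift y by mean

$$\mu = \frac{1}{\ell} \sum_{i=1}^{\ell} y_i, \qquad y_i := y_i - \mu$$

Shift x by mean

$$\bar{\mathbf{x}} = \frac{1}{\ell} \sum_{i=1}^{\ell} \mathbf{x}_i, \qquad \mathbf{x}_i := \mathbf{x}_i - \bar{\mathbf{x}}$$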

Page 62: Regression Models for Drug Discovery

Ridge Regression with bias

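Recenter X and y by $\bar{\mathbf{x}}$ and $\mu$

Find least squares solution

$$\mathbf{w} = (\mathbf{X}'\mathbf{X} + \lambda \mathbf{I})^{-1}\mathbf{X}'\mathbf{y}$$

How do you predict a new point?

$$f(\mathbf{x}) = \mathbf{x}'\mathbf{w}$$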

Page 63: Regression Models for Drug Discovery

To predict new point x

Must center and shift

$$f(\mathbf{x}) = (\mathbf{x} - \bar{\mathbf{x}})'\mathbf{w} + \mu$$

Or equivalently

$$f(\mathbf{x}) = \mathbf{x}'\mathbf{w} + (\mu - \bar{\mathbf{x}}'\mathbf{w})$$
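The centering recipe and the two equivalent prediction formulas can be checked on a 1-d example. This sketch assumes synthetic data lying exactly on y = 2x + 2 and uses the hypothetical helper `fit_with_bias_1d`; in 1-d the ridge system reduces to one division.

```python
def fit_with_bias_1d(xs, ys, lam=0.0):
    """Center x and y, fit 1-d ridge on the centered data, and recover the bias.

    Returns (w, b, x_bar, mu) with b = mu - x_bar * w.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    mu = sum(ys) / n
    xc = [x - x_bar for x in xs]   # shift x by its mean
    yc = [y - mu for y in ys]      # shift y by its mean
    w = sum(a * b for a, b in zip(xc, yc)) / (sum(a * a for a in xc) + lam)
    b = mu - x_bar * w
    return w, b, x_bar, mu


# Synthetic data on the line y = 2x + 2 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, b, x_bar, mu = fit_with_bias_1d(xs, ys)

x_new = 5.0
pred_centered = (x_new - x_bar) * w + mu   # center the new point, then shift back
pred_bias = x_new * w + b                  # equivalent form with explicit bias
print(w, b, pred_centered, pred_bias)
```

With `lam=0.0` the fit recovers the slope and intercept of the generating line; a positive `lam` shrinks w but leaves the two prediction formulas in agreement.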

Page 64: Regression Models for Drug Discovery

Main Points: Least Squares

Very nice optimization problem:
• Convex
• Closed form solution

Regularization good for:
• Niceness of optimization problem (conditioning)
• Generalization

Need to scale X and Y to account for bias

Quality of model estimated by testing on out of sample set

Page 65: Regression Models for Drug Discovery

Limitations?

Will this model work on all drug discovery models?

Page 66: Regression Models for Drug Discovery

Next Class

• Alternative losses and regularization based on 1-norm regularization
• Just in time linear programming

Page 67: Regression Models for Drug Discovery

Nonlinear Regression