Chapter 2: Modeling Enzyme Kinetics
From a mathematical point of view, the art of good modeling relies on: (i) a sound
understanding and appreciation of the biological problem; (ii) a realistic
mathematical representation of the important biological phenomena; (iii) finding
useful solutions, preferably quantitative; and most crucially important, (iv) a
biological interpretation of the mathematical results in terms of insights and
predictions. The mathematics is dictated by the biology and not vice versa.
Sometimes the mathematics can be very simple. Useful mathematical biology
research is not judged by mathematical standards but by different and no less
demanding ones.
- Jim Murray, 1993
Introduction
When investigating a novel method, it is often very useful to use a small example,
or “toy problem,” to examine its workings before jumping into a larger problem. As
mentioned previously, these simplified problems are hard to come by in biology but there
is a toy problem at the heart of the simulation, namely, enzymatic reactions. This chapter
contains a description of the basic enzyme reaction first described by Michaelis and
Menten in 1913, as well as a comparison between the results of the deterministic solution
and the stochastic solution. This problem was chosen for a variety of reasons. Firstly,
this problem contains much of the basics of enzymatic biology in its midst. Secondly,
this is one of the few problems for which a detailed solution can be constructed. Finally,
it affords a readily understandable introduction to the stochastic simulation.
In the case of the deterministic solution, perturbation theory will be used to
provide an approximate solution, but because any modern computer algebra system will
be able to easily provide a numerical solution to the resulting differential equations, that
will be included as well. To solve this problem using a stochastic solution, a short
program using the Mathematica programming language has been developed. This will
allow a detailed example of the stochastic simulation algorithm. By comparing the
solutions from the deterministic methods to the stochastic simulation solutions, it will be
shown that they are in good agreement with each other in a global sense. However, it will
also be shown that there are situations in which the deterministic solution may not
capture the true state of the system.
Deterministic Solution
The theory for chemical kinetics in a large volume is well grounded in
experiments. Early forms of the Law of Mass Action, which states that the rate of the
reaction is proportional to the concentration of the reactants, appeared at least as early as
1802 with Berthollet’s nearly correct formulations. The final correct formulation came
from extensive experiments that Waage and Guldberg published in 1864 (Waage, 1986).
But an important piece of the puzzle was still missing, and it was another 25 years before
the discovery of the process that allowed some molecules to react while others remained
inactive.
In 1889, while investigating an offshoot of his work on ionic solutions (work that
would eventually win him a Nobel Prize), Svante Arrhenius studied the effects
temperature has on the rate of a reaction. His data led him to conclude that in a reaction
system only a certain number of molecules are able to react at any given time. He
proposed that some sort of chemical catalyst must have activated the molecules that are
able to react. His theory said that the catalyst (C) would first form an intermediate
compound (CS) with the substrate (S), and the resulting compounds are then able to enter
a transition state that lowers the amount of energy that is needed to perform a chemical
reaction. The compound then decomposes into a product (P) and the catalyst (C), and the
catalyst is then free to participate in another reaction:
C + S ⇒ CS,  CS ⇒ P + C. (2.1)
Thus, the notion of activation energy for a chemical reaction was born (Teich, 1992).
Two decades later, Michaelis and Menten published a seminal piece of work on
how this type of system behaves. In their paper, they focused on a biological system that
has come to be known as the basic enzyme reaction (Michaelis and Menten, 1913). It
was very similar to Arrhenius' system but with the addition of a backwards reaction
(dissociation) of the complex (ES). There was also a terminology change from catalyst
(C) to enzyme (E). This was just a minor change as an enzyme is defined to be an
organic catalyst. Schematically, this can be represented by
E + S ⇌ ES  (forward rate parameter k1, reverse rate parameter k−1),  ES ⇒ P + E  (rate parameter k2). (2.2)
In words, one molecule of the enzyme combines with one molecule of the substrate to
form one molecule of the complex. The complex can dissociate into one molecule of
each of the enzyme and substrate, or it can produce a product and a recycled enzyme. In
this formulation k1 is the rate parameter for the forward substrate/enzyme (catalyst), k−1
is the rate parameter for the backwards reactions, and k2 is the rate parameter for the
creation of the product. There is no backwards reaction forming the complex from the
product and the enzyme, as it is assumed that this reaction is energetically unfavorable
and the enzyme is much more likely to participate with the substrate in the formation of
the complex. Given an initial amount (or concentrations) of the reactants and the rate
parameters, the question is to determine the amount of product at some later time.
Using the Law of Mass Action, it is possible to write down the change in the
amount of each of the reactants, leading to one differential equation for each of the
reactants. The fact that it may not adequately capture the true state of small systems is a
problem that will be addressed shortly. The presentation of the basic enzyme reaction
that follows draws from the conventional approaches (Edelstein-Keshet, 1998; Murray,
1993).
Denoting the concentrations in (2.2) by
e = [E],  s = [S],  c = [ES],  p = [P], (2.3)
the Law of Mass Action applied to this system leads to the following four differential
equations that describe the kinetics of the basic enzyme reaction:
ds/dt = −k1 e s + k−1 c,        de/dt = −k1 e s + (k−1 + k2) c,
dc/dt = k1 e s − (k−1 + k2) c,    dp/dt = k2 c. (2.4)
As the system starts with only the substrate and enzymes, the initial conditions are then
e(0) = e0,  s(0) = s0,  c(0) = 0,  p(0) = 0. (2.5)
Before solving this system, it is important to note that the equations are not all
independent. First of all, given a fixed amount of enzyme, it is possible to write down a
conservation law by noting that the amount of free enzyme and bound enzyme must be
constant:
e(t) + c(t) = e0. (2.6)
Combining this back into the first three differential equations, it is possible to eliminate
one to end up with
ds/dt = −k1 e0 s + (k1 s + k−1) c,    dc/dt = k1 e0 s − (k1 s + k−1 + k2) c, (2.7)
with the initial conditions
s(0) = s0,  c(0) = 0. (2.8)
Finally, the equation for the product can be uncoupled from the others, and integration
leads to
p(t) = k2 ∫₀ᵗ c(u) du, (2.9)
which provides the solution for the product once the solution for the complex is known.
The end result is a reduction of the set of four differential equations into two coupled
ones.
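To complement the perturbation analysis that follows, the reduced system (2.7) is also easy to integrate numerically. The thesis used a computer algebra system for this step; the sketch below does the same job with a hand-rolled fourth-order Runge-Kutta stepper, and its rate parameters and initial amounts are illustrative placeholders, not values from the thesis.

```python
# Numerically integrate the reduced Michaelis-Menten system (2.7):
#   ds/dt = -k1*e0*s + (k1*s + km1)*c
#   dc/dt =  k1*e0*s - (k1*s + km1 + k2)*c
# using a classical 4th-order Runge-Kutta stepper (no external libraries).
# Parameter values below are illustrative placeholders, not from the thesis.

def mm_reduced(t, y, k1, km1, k2, e0):
    s, c = y
    ds = -k1 * e0 * s + (k1 * s + km1) * c
    dc = k1 * e0 * s - (k1 * s + km1 + k2) * c
    return [ds, dc]

def rk4(f, y0, t0, t1, n, *args):
    """Classical RK4 with n fixed steps from t0 to t1."""
    h = (t1 - t0) / n
    t, y = t0, list(y0)
    for _ in range(n):
        k1v = f(t, y, *args)
        k2v = f(t + h/2, [y[i] + h/2 * k1v[i] for i in range(len(y))], *args)
        k3v = f(t + h/2, [y[i] + h/2 * k2v[i] for i in range(len(y))], *args)
        k4v = f(t + h, [y[i] + h * k3v[i] for i in range(len(y))], *args)
        y = [y[i] + h/6 * (k1v[i] + 2*k2v[i] + 2*k3v[i] + k4v[i])
             for i in range(len(y))]
        t += h
    return y

# Placeholder parameters: k1, k-1, k2 and initial amounts e0, s0.
k1, km1, k2, e0, s0 = 1.0, 0.5, 0.3, 1.0, 10.0
s_end, c_end = rk4(mm_reduced, [s0, 0.0], 0.0, 50.0, 5000, k1, km1, k2, e0)
print(s_end, c_end)
```

The product then follows from (2.9) by quadrature of the computed c(t).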
As the situation under consideration is one where there are a small number of
enzymes compared to the number of substrate molecules available, let
ε = e0/s0, (2.10)
which leads to using the following variables to nondimensionalize the equations
τ = k1 e0 t,  u(τ) = s(t)/s0,  v(τ) = c(t)/e0,  λ = k2/(k1 s0),  K = (k−1 + k2)/(k1 s0). (2.11)
Then the system in (2.7) becomes
du/dτ = −u + (u + K − λ) v,    ε dv/dτ = u − (u + K) v, (2.12)
with the initial conditions
u(0) = 1,  v(0) = 0. (2.13)
In looking for a solution for this problem, the appearance of the small parameter ε
in front of a derivative in (2.12) suggests that this is a singular perturbation problem, and
looking for a single regular Taylor series expansion solution in terms of the variables u,v
and ε will not be fruitful. Because of this, it is necessary to create a multiscale solution
from matching inner and outer solutions. This can be accomplished by first looking for
the regular Taylor expansion solution in the form
u(τ;ε) = Σn=0..∞ ε^n un(τ),    v(τ;ε) = Σn=0..∞ ε^n vn(τ). (2.14)
Substituting this into (2.12) and equating like powers of ε yields for the O(1) system
In practice it is extremely difficult, if not impossible, to construct even
approximate solutions to a system that contains any more reactions than the Michaelis-
Menten problem and numerical methods must be used (McQuarrie, 1967).
As shown above, the Law of Mass Action applied to the basic enzyme reaction
leads to a set of coupled differential equations that can be approximated using
perturbation theory, and the differential equations are easily solved numerically as well.
The Law of Mass Action is thus doubly attractive: it is well grounded in experiments,
and it leads to equations that can be readily solved. But while differential equations are a natural way
to model chemical reactions in a vat, they might not adequately represent the true state of
the system in a cell.
Implicit in using the Law of Mass Action are two key assumptions that should be
mentioned: continuity and determinism. With regards to the continuity assumption, it is
important to note that the individual genes are often only present in one or two copies per
cell. Therefore, there are only one or two regulatory regions to which the regulatory
molecules can bind. In addition, the regulatory molecules that bind to these regions are
typically produced in low quantities: there may be only a few tens of molecules of a
transcription factor in the cell nucleus. This has been shown explicitly in bacterial cells,
but there is ample evidence supporting this fact in eukaryotic cells as well (Davidson,
1986; Guptasarma, 1995). The low number of molecules may compromise the notion of
continuity.
As for determinism, the rates of some of these reactions are so slow that many
minutes may pass before, for instance, the start of mRNA transcription after the
necessary molecules are present, or between the start and finish of mRNA creation
(Davidson, 1986). This may call into question the notion of the deterministic change
presupposed by the use of the differential operator due to the fluctuations in the timing of
cellular events. As a consequence, two regulatory systems having the same initial
conditions might ultimately settle into different states, a phenomenon strengthened by the
small numbers of molecules involved.
There have been some recent experimental results that strongly suggest that cells
do in fact behave stochastically. A review can be found in a recent article by the pioneers
of modeling stochastic processes in biology, and they drive home the point that
regulatory molecules are present in very low concentrations in cells, with a few hundred
being an upper limit, and dozens being a normal phenomenon (McAdams and Arkin,
1999). A study of these systems has shown that the stochastic fluctuations in such a
system can produce widely varying protein levels among cells of the same type in
a population (McAdams and Arkin, 1997). This is especially true when the molecule
under investigation is part of the regulatory mechanism of the cell (Arkin et al., 1998).
Most recently, a study in yeast has produced intriguing data concerning the noise in a
biological system due to the intrinsic fluctuations (Elowitz et al., 2002).
When the fluctuations in the system are small, it is possible to use a reaction rate
equation approach. But when fluctuations are not negligibly small, the reaction rate
equations will give results that are at best misleading (showing only the mean behavior),
and possibly very wrong if the fluctuations can give rise to important effects. The real
problem arises in that it is not always known beforehand whether fluctuations are
important. The only way to find out is to use a stochastic simulation: If several
stochastic trajectories give results that appear to be identical, then reaction rate equations
could indeed have been used. But if the differences between the trajectories are noticeable,
then reaction rate equations would probably not have been appropriate. It is possible to
forge ahead, and the result is usually a mathematical model that describes the
phenomena, but fails to capture the fluctuations present in the system.
Some of the concerns about fluctuations in a system have been around for a long
time, if only in theory. With regards to the number of molecules in a cell, this was first
mentioned in the English literature by the biochemist J. B. S. Haldane when he
mentioned that critical processes might be carried out by one of a few enzymes per cell
(Haldane, 1930). Fifteen years later, this was repeated as a known fact in Nature
(McIlwain, 1946). More recently there appeared a paper on the question of whether the
laws of chemistry apply to living cells (Halling, 1989). It isn’t quite as elegant as
Purcell’s paper on life at low Reynolds numbers (Purcell, 1977), but like this famous talk,
the paper points out that it is a very different world inside a cell.
Consequently, the fluctuations in the system may actually be an important part of
the system. With these concerns in mind, it seems only natural to investigate an approach
that incorporates the small volumes, the small numbers of molecules, and the
fluctuations inherent in such a system. These investigations are still relatively new, but in recent years the stochastic
simulation algorithm has been used to model phage λ infected E. coli cells (Arkin et al.,
1998), and calcium wave propagation in rat hepatocytes (Gracheva et al., 2001).
Stochastic Solution
The first mention of using stochastic methods to model chemical reactions
appeared in 1940 (Delbrück, 1940; Kramers, 1940). But it wasn’t until the early 1950s
that it became clear that in small systems the Law of Mass Action breaks down (Rényi,
1954) and even small fluctuations in the number of molecules may be a significant factor
in the behavior of the system (Singer, 1953). Soon after, it became evident that some
processes in biological cells fell into this category and that a proper mathematical
formulation of the chemical reactions in the cells would most likely be based on stochastics
(Bartholomay, 1958).
The stochastic approach considers the sets of possible reactions and examines the
possible transitions of the system. As an example, consider the following irreversible
unimolecular reaction
A → B  (rate parameter k), (2.30)
which is common in radioactive decay processes. In words, the molecule A is converted
to B with rate parameter k. The stochastic description of the system is characterized in
the following manner. Let X(t) be a random variable that denotes the number of A
molecules at time t. Then
1) The probability of a transition from (x + 1) molecules to x molecules in the
interval (t, t + Δt) is k(x + 1)Δt + o(Δt), where k is the rate constant and o(Δt) takes the
usual meaning that o(Δt)/Δt → 0 as Δt → 0.
2) The probability of a transition from x to x − j, j > 1, in the interval (t, t + Δt) is
o(Δt).
3) The probability of a transition from x to x + j, j ≥ 1, in the interval (t, t + Δt) is
zero.
Denoting the probability of X(t) = x by Px(t), a balance of the terms yields
Px(t + Δt) = k(x + 1)Δt Px+1(t) + (1 − kxΔt) Px(t) + o(Δt). (2.31)
Simplifying and taking the limit Δt → 0 yields the differential-difference equation
dPx(t)/dt = k(x + 1) Px+1(t) − kx Px(t), (2.32)
which is also called the chemical master equation for the system.
The solution of the chemical master equation can be thought of as a Markovian
random walk in the space of the reacting variables. It measures the probability of finding
the system in a particular state at any given time, and it can be rigorously derived from a
microphysical standpoint (Gillespie, 1992). Analytic solutions of master equations are
difficult to come by, but in this example it is possible to transform the differential-
difference equation into a partial differential equation through the use of the probability
generating function
F(s,t) = Σx=0..∞ Px(t) s^x. (2.33)
Substituting (2.33) into (2.32) and simplifying leads to
∂F/∂t = k(1 − s) ∂F/∂s. (2.34)
Given the initial condition F(s,0) = s^x0, the solution is then
F(s,t) = [1 + (s − 1) e^(−kt)]^x0. (2.35)
Recall that if X(t) is a random variable, then E[X(t)], the expected value, is defined as
Σx x Px(t), which is, conveniently enough, ∂F/∂s evaluated at s = 1. Computing this value leads to
E[X(t)] = x0 e^(−kt), (2.36)
which is the solution of the Mass Action formulation for the system:
dA/dt = −kA. (2.37)
Thus, the two representations are consistent. However, this is only true in general for
unimolecular reactions (McQuarrie, 1967).
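The agreement between (2.36) and (2.37) can be checked numerically: simulate the decay reaction many times and compare the empirical mean of X(t) with x0 e^(−kt). The sketch below is not from the thesis; the parameter values and sample size are arbitrary choices for illustration.

```python
# Monte Carlo check that the stochastic mean of A -> B decay matches
# the deterministic solution x0 * exp(-k*t)  (equations 2.36 / 2.37).
import math
import random

def decay_count(x0, k, t_obs, rng):
    """Simulate A -> B exactly; return the number of A molecules at t_obs."""
    x, t = x0, 0.0
    while x > 0:
        # With x molecules present the total propensity is k*x; the waiting
        # time to the next decay event is exponential with that rate.
        t += rng.expovariate(k * x)
        if t > t_obs:
            break
        x -= 1
    return x

rng = random.Random(42)          # fixed seed for reproducibility
x0, k, t_obs, runs = 100, 0.5, 1.0, 2000
mean = sum(decay_count(x0, k, t_obs, rng) for _ in range(runs)) / runs
exact = x0 * math.exp(-k * t_obs)   # = 100*e^-0.5 ≈ 60.65
print(mean, exact)
```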
Historically, numerical methods were used to construct solutions to the master
equations, but the solutions constructed in this manner have some pitfalls. These include
the need to approximate higher-order moments as a product of lower moments, and
convergence issues (McQuarrie, 1967). What was needed was a general method that
would solve these sorts of problems and this came with the stochastic simulation
algorithm.
Stochastic Simulation Algorithm
Given a set of molecular species {Sµ}µ=1..N and a set of reactions {Rµ}µ=1..M in which they can
participate, the Gillespie algorithm, as it has come to be known, is an exact
method for numerically computing the time evolution of a chemical system. By exact it
is meant that the results are provably equivalent to the chemical master equation, but at
no time is it necessary for the master equation to be written down, much less solved.
The fundamental hypothesis of the method is that the reaction parameter cµ
associated with the reaction Rµ can be defined in the following manner:
cµ δt ≡ the average probability, to the first order in δt, that a particular combination of Rµ reactant molecules will react in the next time interval δt.
In his original work, Gillespie shows that this definition has a valid physical
basis and that the reaction parameter cµ can be easily connected to the traditional
reaction rate constant kµ (Gillespie, 1976).
The method is based on the joint probability density function P(τ,µ) , defined by
P(τ,µ) dτ ≡ the probability at time t that the next reaction will occur in the differential time interval (t + τ, t + τ + dτ) and will be of type Rµ.
This is a departure from the usual stochastic approach that starts from the
probability function P(X1, X2, …, XN; t), defined as the probability that at time t there will
be X1 molecules of S1, X2 molecules of S2, …, and XN molecules of SN . By using
P(τ,µ) as the basis of the approach, it is possible to create a tractable method to compute
the time evolution of the system. To construct a formula for this quantity, Gillespie starts
by defining the quantity hµ as the number of distinct molecular reactant combinations for
the reaction Rµ . This is nothing more than a combinatorial factor and Table 2.1 lists
some example values.
    Reaction                  hµ                   Reaction order
    * → Sj                    1                    Zeroth
    Sj → Sk                   Xj                   First
    Sj + Sk → Sl              Xj·Xk                Second
    Sj + Sj → Sk              Xj(Xj − 1)/2         Second
    Si + Sj + Sj → Sk         Xi·Xj(Xj − 1)/2      Third

Table 2.1 Appropriate combinatorial factors for various reactions. In
actuality, everything can be thought of as a zeroth-, first-, or second-order
reaction, or a sequential combination of these, and there is no need for the higher-
order reactions.
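The entries of Table 2.1 are simply counts of distinct reactant combinations, so hµ can be computed with binomial coefficients. The helper below is illustrative only; the species names and the data layout are assumptions, not from the thesis.

```python
# Count distinct reactant combinations h_mu for a reaction, per Table 2.1.
# Reactants are given as (species_name, stoichiometry) pairs; X maps each
# species name to its current population.
from math import comb

def h_mu(reactants, X):
    h = 1
    for species, stoich in reactants:
        # comb(X, n) counts ways to choose n identical reactant molecules,
        # giving X for n=1 and X*(X-1)/2 for n=2, as in Table 2.1.
        h *= comb(X[species], stoich)
    return h

X = {"Sj": 5, "Sk": 3, "Si": 4}
print(h_mu([], X))                        # zeroth order: 1
print(h_mu([("Sj", 1)], X))               # first order: Xj = 5
print(h_mu([("Sj", 1), ("Sk", 1)], X))    # Xj*Xk = 15
print(h_mu([("Sj", 2)], X))               # Xj*(Xj-1)/2 = 10
print(h_mu([("Si", 1), ("Sj", 2)], X))    # Xi*Xj*(Xj-1)/2 = 40
```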
Combining this definition of hµ with the previous definition of the reaction
parameter cµ leads to the conclusion that the probability, to the first order in δt, that an Rµ
reaction will occur in the next time interval δt is therefore
hµ cµ δt. (2.38)
Now P(τ,µ) dτ can be computed as the product of P0(τ), the probability that no
reaction occurs in the time interval (t, t + τ), and hµ cµ dτ, the probability that the specific
reaction Rµ occurs in the time interval (t + τ, t + τ + dτ):
P(τ,µ) dτ = P0(τ) hµ cµ dτ. (2.39)
All that is now required is to calculate the term P0(τ). To construct an expression for this
term, divide the interval (t, t + τ) into K subintervals, each of length ε = τ/K. The
probability that none of the reactions {Rµ}µ=1..M occurs in the time interval (t + jε, t + (j + 1)ε)
(for any arbitrary j) is
∏i=1..M [1 − hi ci ε + o(ε)] = 1 − Σi=1..M hi ci ε + o(ε). (2.40)
Since the K subintervals are independent, these probabilities multiply, giving
P0(τ) = [1 − Σi=1..M hi ci (τ/K) + o(τ/K)]^K. (2.41)
But as this expression is valid for any K, even infinitely large ones, the expression can
also be written as
P0(τ) = lim K→∞ [1 − Σi=1..M hi ci (τ/K) + o(K^−1)]^K. (2.42)
However, this is nothing more than one of the limit formulas for the exponential function,
and thus
P0(τ) = exp(−Σi=1..M hi ci τ). (2.43)
Therefore, after defining
aµ ≡ hµ cµ,   a0 ≡ Σi=1..M hi ci, (2.44)
the result is an expression for P(τ,µ):
P(τ,µ) = aµ exp(−a0 τ). (2.45)
Implementation
This algorithm can easily be implemented in an efficient modularized form to
accommodate quite large reaction sets of considerable complexity.
For an easy implementation, the joint density can be broken into two factors using
Bayes' rule:
P(τ,µ) = P(τ) · P(µ|τ). (2.46)
But note that the addition property for probabilities can be used to calculate an alternate
form for P(τ):
P(τ) = Σµ=1..M P(τ,µ), (2.47)
and substituting this into (2.45) leads to values for its component parts:
P(τ) = a0 exp(−a0 τ), (2.48)
P(µ|τ) = aµ/a0. (2.49)
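Concretely, drawing from (2.48) is inverse-transform sampling of an exponential density, and drawing from (2.49) is a weighted choice among the reaction channels. A quick numerical check with made-up propensities (the values below are arbitrary, not from the thesis) confirms both draws behave as expected.

```python
# Sample tau from P(tau) = a0*exp(-a0*tau) by inverse transform, and the
# reaction index mu from P(mu|tau) = a_mu/a0 by a cumulative-sum search.
import math
import random

rng = random.Random(1)
a = [0.2, 0.5, 0.3]              # made-up propensities a_mu
a0 = sum(a)

def draw(rng, a, a0):
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    tau = -math.log(1.0 - rng.random()) / a0      # inverse transform of (2.48)
    r2a0 = rng.random() * a0
    acc, mu = 0.0, 0
    for i, ai in enumerate(a):                    # smallest mu whose cumulative
        acc += ai                                 # sum reaches r2*a0, per (2.49)
        if r2a0 <= acc:
            mu = i
            break
    return tau, mu

n = 20000
taus, counts = [], [0, 0, 0]
for _ in range(n):
    tau, mu = draw(rng, a, a0)
    taus.append(tau)
    counts[mu] += 1
mean_tau = sum(taus) / n
print(mean_tau, 1.0 / a0)        # exponential mean should be ~1/a0
print([c / n for c in counts])   # should be close to [0.2, 0.5, 0.3]
```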
Given these fundamental probability density functions, the following algorithm can
be used to carry out the reaction set simulation:
1) Initialization
   a. Set values for the cµ
   b. Set the initial number of the Sµ reactants
   c. Set t = 0, and select a value for tmax, the maximum simulation time
2) Loop
   a. Compute aµ = hµ cµ for each reaction and a0 = Σi=1..M hi ci
   b. Generate two random numbers r1 and r2 from a uniform distribution on [0,1]
   c. Compute the next time interval τ = (1/a0) ln(1/r1) (a draw from the
      probability density function of (2.48))
   d. Select the reaction to be run by finding the µ such that
      Σν=1..µ−1 aν < r2 a0 ≤ Σν=1..µ aν (a draw from the probability density
      function of (2.49))
   e. Adjust t = t + τ and update the Sµ values according to the Rµ reaction that
      just occurred
   f. If t > tmax, then terminate. Otherwise, go to a.
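Putting the initialization and loop steps together for the basic enzyme reaction (2.2) gives the following sketch. The thesis implementation was written in Mathematica, so this Python transcription, its rate parameters, and its initial counts are illustrative only.

```python
# Direct-method SSA (steps 1-2 above) for the basic enzyme reaction (2.2):
#   R1: E + S -> ES (c1),  R2: ES -> E + S (c2),  R3: ES -> P + E (c3).
# Rate parameters and initial counts are illustrative placeholders.
import math
import random

def ssa_mm(E, S, c1, c2, c3, t_max, rng):
    ES, P, t = 0, 0, 0.0
    while t < t_max:
        # Step 2a: propensities a_mu = h_mu * c_mu and their sum a0.
        a = [c1 * E * S, c2 * ES, c3 * ES]
        a0 = sum(a)
        if a0 == 0.0:                      # no reaction can fire any more
            break
        # Steps 2b-2c: next-reaction time from the exponential density (2.48).
        t += -math.log(1.0 - rng.random()) / a0
        if t > t_max:
            break
        # Step 2d: pick the channel mu with probability a_mu/a0 (2.49).
        r2a0, acc, mu = rng.random() * a0, 0.0, 0
        for i, ai in enumerate(a):
            acc += ai
            if r2a0 <= acc:
                mu = i
                break
        # Step 2e: update the populations for the chosen reaction.
        if mu == 0:
            E, S, ES = E - 1, S - 1, ES + 1
        elif mu == 1:
            E, S, ES = E + 1, S + 1, ES - 1
        else:
            E, ES, P = E + 1, ES - 1, P + 1
    return E, S, ES, P

rng = random.Random(7)
E0, S0 = 10, 100
E, S, ES, P = ssa_mm(E0, S0, c1=0.01, c2=0.1, c3=0.1, t_max=500.0, rng=rng)
print(E, S, ES, P)
```

Note that the conservation law (2.6) holds exactly along every stochastic trajectory: free plus bound enzyme always equals E0.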
Because the speed of the SSA is linear with respect to the number of reactions,
adding new reaction channels will not greatly increase the runtime of the simulation;
i.e., doubling either the number of reactions or the number of reactant species doubles
(approximately) the total runtime of the algorithm. The speed of the SSA depends more
on the number of molecules. This is seen by noting that the computation of the next time
interval in (2c) above depends on the reciprocal of a0, a term comprised of, among other
things, the number of molecules in the simulation. If the reaction set contains at least one
second-order reaction, then a0 will contain at least one product of species populations. In
this case the speed of the simulation will fall off like the reciprocal of the square of the
population. However, the runtime can be reduced by noting that not all of the aµ values
need to be recalculated after each pass, but only those for reactions whose reactant
populations were changed by the reaction that just fired. An efficient implementation will take advantage of this fact.
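This bookkeeping can be captured in a dependency table built once at initialization: for each reaction, record which species it changes, and after it fires recompute only the propensities of reactions that consume one of those species. The sketch below uses the enzyme system (2.2) plus a hypothetical product-degradation channel P → ∅, added here only to make the dependency structure visible; it is not part of the thesis model.

```python
# Sketch of the propensity-update optimization: after reaction mu fires,
# recompute a_nu only for reactions nu whose reactants were changed by mu.
# R0: E+S->ES, R1: ES->E+S, R2: ES->P+E, and a hypothetical R3: P -> (nothing).

reactants = {0: {"E", "S"}, 1: {"ES"}, 2: {"ES"}, 3: {"P"}}
changes   = {0: {"E", "S", "ES"},     # species whose counts change when mu fires
             1: {"E", "S", "ES"},
             2: {"E", "ES", "P"},
             3: {"P"}}

# Build the dependency table once: mu -> set of reactions to recompute.
depends = {mu: {nu for nu, reac in reactants.items() if reac & changes[mu]}
           for mu in reactants}
print(depends)
```

Here firing the degradation channel R3 only requires recomputing its own propensity, while firing R2 touches every channel; the savings grow with the sparsity of the reaction network.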
Recent improvements to the algorithm, including a method that does not require
the probabilities to be updated after every reaction, are helping to keep the runtime in
check (Gibson and Bruck, 2000; Gillespie, 2001). As currently implemented, a typical
run of the Hox simulation presented in Chapter 3 (without the aforementioned speedups)
consists of over 23 million events, and takes less than 6 minutes on a computer with a
2 GHz Pentium 4 processor.
Two important points should be noted about the SSA: the solution of a system of
coupled chemical reactions by this method is entirely equivalent to the solution of the