Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates
Sedigheh Zolaktaf1, Frits Dannenberg2, Xander Rudelis2, Anne Condon1, Joseph M. Schaeffer3, Mark Schmidt1, Chris Thachuk2, and Erik Winfree2
1University of British Columbia, Vancouver, BC, Canada
2California Institute of Technology, Pasadena, CA, USA
3Autodesk Research, San Francisco, CA, USA
Nucleic Acid Kinetic Simulators
Kinetic simulators aim to predict the kinetics of interacting nucleic acid strands, e.g., reaction rate constants.
Useful for biological and biotechnological applications [Yurke et al., 2000].
Predicting kinetics is difficult (dependent on sequence, temperature, ...).
Accurate models of nucleic acid kinetics are required.
Contributions
We introduce an Arrhenius kinetic model.
We train kinetic models.
We collect a dataset of experimentally determined reaction rate constants.
We introduce a computational framework for predicting reaction rate constants.
Our Arrhenius model performs better than an existing model.
Modelling Kinetics of Interacting Nucleic Acid Strands
Kinetics of interacting strands are modelled as continuous-time Markov chains (CTMCs) [Schaeffer et al., 2015].
The state space represents non-pseudoknotted secondary structures.
[Figure: state i and state j, connected by transitions with rates k_ij and k_ji.]
The transition rates are determined by a kinetic model and obey detailed balance:
$$\frac{k_{ij}}{k_{ji}} = e^{-\frac{\Delta G^0(j) - \Delta G^0(i)}{RT}}$$
ΔG^0(i): free energy of state i, R: gas constant, T: temperature.
Estimate reaction rate constants from mean first passage times (MFPTs).
The MFPT of a CTMC is the average time it takes to reach one of a set of final states from an initial state.
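The MFPT definition above can be made concrete as a linear system, which is a standard result for CTMCs. In the sketch below, τ_i (the MFPT starting from state i) and F (the set of final states) are notation assumed here for illustration, not symbols taken from the slides.

```latex
\[
\tau_f = 0 \quad \text{for } f \in F,
\qquad
\sum_{j \neq i} k_{ij}\,(\tau_j - \tau_i) = -1 \quad \text{for } i \notin F .
\]
```

For a two-state chain in which state j is the only final state, the system reduces to τ_i = 1/k_ij, the mean of an exponential waiting time with rate k_ij.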
The Metropolis Kinetic Model
One kinetic model is the Metropolis model [Schaeffer et al., 2015].
For states i and j, if ΔG^0(i) > ΔG^0(j), the unimolecular transition rates are:
$$k_{ij} = k_{\mathrm{uni}}, \qquad k_{ji} = k_{\mathrm{uni}}\, e^{-\frac{\Delta G^0(i) - \Delta G^0(j)}{RT}}$$
k_uni > 0: unimolecular rate constant
Bimolecular transitions are given by:
$$k_{ij} = k_{\mathrm{bi}}\, u$$
with k_ji following from detailed balance.
k_bi > 0: bimolecular rate constant, u: initial concentration of the reactants
When trained with our framework, the model's predictions are off by several orders of magnitude.
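As a minimal sketch of the unimolecular Metropolis rule, the function below returns k_uni for an energetically downhill step and scales it by a Boltzmann factor for an uphill step. The function name, the choice of kcal/mol for free energies, and the example numbers are assumptions made for illustration; this is not the simulator's actual code.

```python
import math

R = 1.987204e-3  # gas constant in kcal/(mol K); the energy unit is an assumption


def metropolis_rate(dg_from, dg_to, k_uni, temperature):
    """Unimolecular Metropolis rate for a step between two states.

    dg_from, dg_to: free energies of the source and target states (kcal/mol).
    k_uni: unimolecular rate constant (> 0).
    temperature: temperature in kelvin.
    """
    if dg_from > dg_to:
        # Downhill step: proceeds at the maximal rate.
        return k_uni
    # Uphill (or neutral) step: slowed by the Boltzmann factor.
    return k_uni * math.exp(-(dg_to - dg_from) / (R * temperature))


# Hypothetical example: state i at -1 kcal/mol, state j at -3 kcal/mol.
k_ij = metropolis_rate(-1.0, -3.0, 1e6, 298.15)
k_ji = metropolis_rate(-3.0, -1.0, 1e6, 298.15)
```

The ratio k_ij / k_ji then equals the detailed-balance factor exp(-(ΔG^0(j) - ΔG^0(i)) / RT), whichever of the two states is lower in free energy.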
The Arrhenius Kinetic Model
We introduce a new model that has locally context-dependent Arrhenius rates.
Transition rates depend on the pairing status of the bases immediately to the left and the right of the base pair forming or breaking.
[Figure: transitions between state i and state j with rates k_ij and k_ji; the two sides of the base pair are labelled with the half contexts loop and end.]
Our model differentiates between seven different half contexts:
C = {stack, loop, end, stack+loop, ...}
The Arrhenius Kinetic Model
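We introduce a new model that has locally context-dependent Arrhenius rates.
Transition rates depend on activation energy.
Based on the set of half contexts C, we redefine k_uni : C × C → R_{>0}:
$$k_{\mathrm{uni}}(\mathrm{left}, \mathrm{right}) = A_{\mathrm{left}}\, e^{-\frac{E_{\mathrm{left}}}{RT}}\, A_{\mathrm{right}}\, e^{-\frac{E_{\mathrm{right}}}{RT}}$$
E_left, E_right: activation energies; A_left, A_right: rate constants
$$k_{\mathrm{uni}}(\mathrm{stack}, \mathrm{stack{+}loop}) = A_{\mathrm{stack}}\, e^{-\frac{E_{\mathrm{stack}}}{RT}}\, A_{\mathrm{stack+loop}}\, e^{-\frac{E_{\mathrm{stack+loop}}}{RT}}$$
[Figure: transition between state i and state j with half contexts stack and stack+loop.]
We redefine k_bi : C × C → R_{>0}:
$$k_{\mathrm{bi}}(\mathrm{left}, \mathrm{right}) = \alpha\, k_{\mathrm{uni}}(\mathrm{left}, \mathrm{right})$$
α: bimolecular scaling constant
$$k_{\mathrm{bi}}(\mathrm{end}, \mathrm{loop}) = \alpha\, A_{\mathrm{end}}\, e^{-\frac{E_{\mathrm{end}}}{RT}}\, A_{\mathrm{loop}}\, e^{-\frac{E_{\mathrm{loop}}}{RT}}$$
[Figure: transition between state i and state j with half contexts end and loop.]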
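The two rate definitions above translate almost directly into code. The sketch below is illustrative only: the parameter values in `HYPOTHETICAL_PARAMS`, the function names, and the use of kcal/mol for activation energies are assumptions, and only three of the seven half contexts are listed.

```python
import math

R = 1.987204e-3  # gas constant in kcal/(mol K); the energy unit is an assumption

# Hypothetical (A, E) pairs per half context; NOT fitted values from the talk.
HYPOTHETICAL_PARAMS = {
    "stack": (1.0e3, 2.0),
    "loop": (5.0e2, 1.0),
    "end": (2.0e3, 0.5),
}


def k_uni(left, right, params, temperature):
    """Unimolecular rate: product of one Arrhenius factor per half context."""
    a_left, e_left = params[left]
    a_right, e_right = params[right]
    rt = R * temperature
    return a_left * math.exp(-e_left / rt) * a_right * math.exp(-e_right / rt)


def k_bi(left, right, params, alpha, temperature):
    """Bimolecular rate: the unimolecular rate scaled by alpha."""
    return alpha * k_uni(left, right, params, temperature)


rate_end_loop = k_bi("end", "loop", HYPOTHETICAL_PARAMS, alpha=0.1, temperature=298.15)
```

Because each half context contributes its own (A, E) pair, the number of parameters grows with the number of half contexts rather than with the number of (left, right) combinations.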
Dataset of Experimental Reaction Rate Constants
|Training set| = 320, |Testing set| = 56
Hairpin closing and opening [Bonnet et al., 1998], [Bonnet, 2000], [Kim et al., 2006]
Bubble closing [Altan-Bonnet et al., 2003]
Helix association and dissociation [Morrison and Stols, 1993], [Reynaldo et al., 2000]
Toehold-mediated 3-way strand displacement [Reynaldo et al., 2000], [Zhang and Winfree, 2009], [Machinek et al., 2014]
Toehold-mediated 4-way strand exchange [Dabby, 2013]
Example: Hairpin Closing and Opening
[Figure: rate constant (s^-1, log scale from 10^1 to 10^5) against 1000/T (K^-1, from 3.1 to 3.6); legend entries T12, T16, T21, T30.]
Hairpin closing (solid) and opening (open) [Bonnet et al., 1998]. The legend shows the hairpin loop length.
Example: Toehold-mediated 3-way Strand Displacement With Mismatches
[Figure: rate constant K (M^-1 s^-1, log scale from 10^2 to 10^8) against mismatch position (nt, from 2 to 16); legend entries 10 nt, 7 nt, 6 nt.]
Toehold-mediated 3-way strand displacement with mismatches [Machinek et al., 2014]. The legend shows the length of the toehold domain.
Framework Overview
Predict Reaction Rate Constants
Instead of stochastic simulations, we estimate mean first passage times (MFPTs) with exact solvers [Suhov and Kelbert, 2008].
This lets us estimate the MFPT of slow reactions efficiently.
We use a reduced state space approach to enable sparse matrix computations.
[Figure: red base pairs can break; blue bases can form.]
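To illustrate what an exact MFPT computation looks like, the sketch below solves the first-passage equations, sum over j of k_ij (τ_j - τ_i) = -1 for every non-final state i with τ = 0 on final states, on a tiny hand-made chain. It is a toy under stated assumptions: the function name and the example rates are hypothetical, and dense Gaussian elimination stands in for the sparse solver that a realistic state space would need.

```python
def mean_first_passage_times(rates, final_states):
    """Return {state: MFPT} for every non-final state.

    rates: dict mapping (i, j) -> k_ij for each allowed transition.
    final_states: set of absorbing target states (MFPT = 0 there).
    """
    states = sorted({s for pair in rates for s in pair})
    transient = [s for s in states if s not in final_states]
    index = {s: n for n, s in enumerate(transient)}
    n = len(transient)

    # Augmented matrix [Q | -1], with Q restricted to the transient states.
    a = [[0.0] * n + [-1.0] for _ in range(n)]
    for (i, j), k in rates.items():
        if i in index:
            a[index[i]][index[i]] -= k      # total exit rate on the diagonal
            if j in index:
                a[index[i]][index[j]] += k  # moves between transient states

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]

    return {s: a[index[s]][n] / a[index[s]][index[s]] for s in transient}


# Hypothetical three-state chain 0 <-> 1 -> 2, with state 2 as the final state.
example_rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0}
mfpt = mean_first_passage_times(example_rates, final_states={2})
```

Unlike a stochastic simulation, the cost of this solve depends on the number of states and not on how long the reaction takes, which is why slow reactions are not a problem for it.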
Evaluate Parameter Sets
Let θ be the set of parameters in a kinetic model.
For the Metropolis model, θ = {ln k_uni, ln k_bi}.
For the Arrhenius model, θ = {ln A_l, E_l | ∀ l ∈ C} ∪ {α}.
Let k_r and k̂_r be the experimental and predicted reaction rate constants of reaction r, respectively.
The prediction error, ε_r, is the difference between log10 k_r and log10 k̂_r, with ε_r ∼ N(0, σ²).
We also use priors.
Thus, the log of the posterior distribution on the training set D_train is:
$$\log P(\theta, \sigma \mid D_{\mathrm{train}}) \approx -\frac{1}{2\sigma^2} \sum_{r \in D_{\mathrm{train}}} \left(\log_{10} k_r - \log_{10} \hat{k}_r\right)^2 - (n + 1) \log \sigma - \frac{\lambda}{2} \lVert \theta \rVert_2^2$$
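The log posterior above can be evaluated term by term. The sketch below is a plain transcription under two assumptions that the slide does not state: n is taken to be the number of training reactions, and log σ is the natural logarithm. The function and argument names are hypothetical.

```python
import math


def log_posterior(log10_k_exp, log10_k_pred, theta, sigma, lam):
    """Approximate log posterior of (theta, sigma) on a training set.

    log10_k_exp, log10_k_pred: log10 of experimental and predicted rate
        constants, in the same reaction order.
    theta: iterable of model parameters (for the L2 penalty).
    sigma: standard deviation of the prediction error (> 0).
    lam: regularisation strength lambda.
    """
    n = len(log10_k_exp)  # assumption: n is the number of training reactions
    squared_error = sum((a - b) ** 2 for a, b in zip(log10_k_exp, log10_k_pred))
    l2_norm_sq = sum(t * t for t in theta)
    return (
        -squared_error / (2.0 * sigma ** 2)
        - (n + 1) * math.log(sigma)
        - 0.5 * lam * l2_norm_sq
    )


# Hypothetical example: two reactions, one of them predicted a decade too fast.
value = log_posterior([1.0, 2.0], [1.0, 3.0], theta=[1.0], sigma=1.0, lam=2.0)
```

Working with log10 of the rate constants means that an error of one decade costs the same whether the reaction is fast or slow.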
Sample Parameter Sets
We draw samples from the posterior distribution of the parameters.
We approximate the expected value of a reaction rate constant by averaging the predictions of all samples.
We use the emcee software package [Foreman-Mackey et al., 2013], a Markov chain Monte Carlo (MCMC) ensemble sampler.
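The averaging step can be sketched independently of the sampler. In the example below, `samples` stands for parameter sets already drawn from the posterior, and `toy_predict` is a hypothetical stand-in for the MFPT-based predictor; neither the names nor the numbers come from the talk, and no emcee calls are shown.

```python
def ensemble_prediction(samples, predict, reaction):
    """Average the predicted rate constant of one reaction over all samples."""
    predictions = [predict(theta, reaction) for theta in samples]
    return sum(predictions) / len(predictions)


def toy_predict(theta, reaction):
    """Hypothetical predictor: rate constant proportional to k_uni."""
    return theta["k_uni"] * reaction["scale"]


# Three hypothetical posterior samples and one hypothetical reaction.
samples = [{"k_uni": 1.0e6}, {"k_uni": 2.0e6}, {"k_uni": 3.0e6}]
k_hat = ensemble_prediction(samples, toy_predict, {"scale": 1.0})
```

Averaging over samples, rather than predicting from a single best parameter set, lets the prediction reflect the spread of the posterior.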
Results
Initial is an initial parameter set.
Ensemble is the MCMC ensemble approach.
Hairpin Closing and Opening
[Figure: rate constant (s^-1, log scale from 10^1 to 10^5) against 1000/T (K^-1, from 3.1 to 3.6); legend entries T12, T16, T21, T30.]
Hairpin closing (solid) and opening (open) [Bonnet et al., 1998]. The legend shows the hairpin loop length.
Metropolis Model Fitting for Hairpin Closing and Opening
[Figure, two panels sharing axes: rate constant (s^-1, log scale from 10^1 to 10^5) against 1000/T (K^-1, from 3.1 to 3.6). First panel: legend entries T12, T16, T21, T30. Second panel: Metropolis with MCMC Ensemble.]
Arrhenius Model Fitting for Hairpin Closing and Opening
[Figure, two panels sharing axes: rate constant (s^-1, log scale from 10^1 to 10^5) against 1000/T (K^-1, from 3.1 to 3.6). First panel: legend entries T12, T16, T21, T30. Second panel: Arrhenius with MCMC Ensemble.]
Model Predictions for Toehold-mediated 3-way Strand Displacement With Mismatches
[Figure, three panels sharing axes: rate constant K (M^-1 s^-1, log scale from 10^2 to 10^8) against mismatch position (nt, from 2 to 16). First panel: legend entries 10 nt, 7 nt, 6 nt. Second panel: Metropolis with MCMC Ensemble. Third panel: Arrhenius with MCMC Ensemble.]
Summary
We introduce an Arrhenius kinetic model.
We train kinetic models.
We collect a dataset of experimentally determined reaction rate constants.
We introduce a computational framework for predicting reaction rate constants.
Our Arrhenius model performs better than an existing model.
Thank You!