Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates

Sedigheh Zolaktaf1, Frits Dannenberg2, Xander Rudelis2, Anne Condon1, Joseph M. Schaeffer3, Mark Schmidt1, Chris Thachuk2, and Erik Winfree2

1University of British Columbia, Vancouver, BC, Canada

2California Institute of Technology, Pasadena, CA, USA

3Autodesk Research, San Francisco, CA, USA

Nucleic Acid Kinetic Simulators

Kinetic simulators aim to predict the kinetics of interacting nucleic acid strands, e.g., reaction rate constants.

Useful for biological and biotechnological applications.

[Yurke et al., 2000]

Predicting kinetics is difficult (it depends on sequence, temperature, ...).

Accurate models of nucleic acid kinetics are required.

Contributions

We introduce an Arrhenius kinetic model.

We train kinetic models. To do so, we collect a dataset of experimentally determined reaction rate constants and introduce a computational framework for predicting reaction rate constants.

Our Arrhenius model performs better than an existing model, the Metropolis model.

Modelling Kinetics of Interacting Nucleic Acid Strands

Kinetics of interacting strands are modelled as continuous-time Markov chains (CTMCs) [Schaeffer et al., 2015].

The state space represents non-pseudoknotted secondary structures.

[Figure: two secondary-structure states, state i and state j, connected by transitions with rates k_ij and k_ji.]

The transition rates are determined by a kinetic model and obey detailed balance:

$$\frac{k_{ij}}{k_{ji}} = e^{-\frac{\Delta G^0(j) - \Delta G^0(i)}{RT}}$$

$\Delta G^0(i)$: free energy of state $i$, $R$: gas constant, $T$: temperature.
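As a worked example of the detailed balance condition, using illustrative numbers that are assumptions and not values from the talk: suppose state $j$ is 1 kcal/mol more stable than state $i$, so $\Delta G^0(j) - \Delta G^0(i) = -1$ kcal/mol, at $T = 298.15$ K with $R \approx 1.987 \times 10^{-3}$ kcal/(mol K).

$$RT \approx 0.592\ \text{kcal/mol}, \qquad \frac{k_{ij}}{k_{ji}} = e^{-\frac{-1}{0.592}} \approx e^{1.69} \approx 5.4$$

Under these assumed numbers, the transition into the more stable state is about five times faster than its reverse.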

Estimate reaction rate constants from mean first passage times (MFPTs).

The MFPT of a CTMC is the average time it takes to reach one of a set of final states from an initial state.

The Metropolis Kinetic Model

One existing kinetic model is the Metropolis model [Schaeffer et al., 2015].

For states $i$ and $j$, if $\Delta G^0(i) > \Delta G^0(j)$, the unimolecular transition rates are:

$$k_{ij} = k_{uni}, \qquad k_{ji} = k_{uni}\, e^{-\frac{\Delta G^0(i) - \Delta G^0(j)}{RT}}$$

$k_{uni} > 0$: unimolecular rate constant

Bimolecular transitions are given by:

$$k_{ij} = k_{bi}\, u$$

$k_{bi} > 0$: bimolecular rate constant
$u$: initial concentration of the reactants

The model predictions are off by several orders of magnitude when we train it with our framework.
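The Metropolis rates can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard Metropolis form (the downhill transition occurs at k_uni and the uphill transition is scaled by a Boltzmann factor), which is consistent with the detailed balance condition stated earlier. The function names, the kcal/mol units, and the numeric values are illustrative assumptions, not taken from the talk.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol*K); the unit choice is an assumption


def metropolis_unimolecular_rates(dG_i, dG_j, k_uni, temperature):
    """Return (k_ij, k_ji) for a unimolecular transition between states i and j.

    dG_i, dG_j are the free energies of the two states in kcal/mol.
    The downhill move gets rate k_uni; the uphill move is slowed by a
    Boltzmann factor so that detailed balance holds.
    """
    rt = R * temperature
    if dG_i > dG_j:
        k_ij = k_uni
        k_ji = k_uni * math.exp(-(dG_i - dG_j) / rt)
    else:
        k_ij = k_uni * math.exp(-(dG_j - dG_i) / rt)
        k_ji = k_uni
    return k_ij, k_ji


def bimolecular_join_rate(k_bi, u):
    """Rate of a bimolecular transition: rate constant times concentration."""
    return k_bi * u


# Hypothetical values, chosen only for illustration.
T = 298.15
k_ij, k_ji = metropolis_unimolecular_rates(dG_i=-1.0, dG_j=-2.5, k_uni=1e6, temperature=T)

# Detailed balance requires k_ij / k_ji == exp(-(dG_j - dG_i) / RT).
expected_ratio = math.exp(-(-2.5 - (-1.0)) / (R * T))
```

Comparing `k_ij / k_ji` with `expected_ratio` is a quick way to confirm that a rate function respects detailed balance before it is used in a simulation.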

The Arrhenius Kinetic Model

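We introduce a new model that has locally context-dependent Arrhenius rates.

Transition rates depend on the pairing status of the bases immediately to the left and to the right of the base pair that is forming or breaking.

[Figure: a transition between state i and state j with rates k_ij and k_ji; the two sides of the base pair that forms or breaks are labelled with the half contexts "end" and "loop".]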

Our model differentiates between seven different half contexts:

C = {stack, loop, end, stack+loop, ...}

Transition rates depend on activation energy.

Based on the set of half contexts $C$, we redefine $k_{uni} : C \times C \to \mathbb{R}_{>0}$:

$$k_{uni}(\text{left}, \text{right}) = A_{\text{left}}\, e^{-\frac{E_{\text{left}}}{RT}}\, A_{\text{right}}\, e^{-\frac{E_{\text{right}}}{RT}}$$

$E_{\text{left}}, E_{\text{right}}$: activation energies
$A_{\text{left}}, A_{\text{right}}$: rate constants

For example:

$$k_{uni}(\text{stack}, \text{stack+loop}) = A_{\text{stack}}\, e^{-\frac{E_{\text{stack}}}{RT}}\, A_{\text{stack+loop}}\, e^{-\frac{E_{\text{stack+loop}}}{RT}}$$

We redefine $k_{bi} : C \times C \to \mathbb{R}_{>0}$:

$$k_{bi}(\text{left}, \text{right}) = \alpha\, k_{uni}(\text{left}, \text{right})$$

$\alpha$: bimolecular scaling constant

For example:

$$k_{bi}(\text{end}, \text{loop}) = \alpha\, A_{\text{end}}\, e^{-\frac{E_{\text{end}}}{RT}}\, A_{\text{loop}}\, e^{-\frac{E_{\text{loop}}}{RT}}$$
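The context-dependent rate constants can be sketched in Python as a lookup over half contexts. This is an illustrative sketch only: the parameter values, the `PARAMS` table, the function names, and the kcal/mol units are assumptions made for the example, and only four of the seven half contexts are listed.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol*K); the unit choice is an assumption

# Hypothetical (ln A, E) pairs per half context. These are NOT fitted values.
PARAMS = {
    "stack": (7.0, 1.5),
    "loop": (6.0, 2.0),
    "end": (6.5, 1.0),
    "stack+loop": (6.8, 1.8),
}
ALPHA = 0.05  # hypothetical bimolecular scaling constant


def half_context_factor(context, temperature, params=PARAMS):
    """Arrhenius factor A * exp(-E / RT) for one half context."""
    ln_a, activation_energy = params[context]
    return math.exp(ln_a - activation_energy / (R * temperature))


def k_uni(left, right, temperature, params=PARAMS):
    """Unimolecular rate constant: product of the left and right factors."""
    return (half_context_factor(left, temperature, params)
            * half_context_factor(right, temperature, params))


def k_bi(left, right, temperature, alpha=ALPHA, params=PARAMS):
    """Bimolecular rate constant: alpha times the unimolecular one."""
    return alpha * k_uni(left, right, temperature, params)


T = 298.15
example_uni = k_uni("stack", "stack+loop", T)
example_bi = k_bi("end", "loop", T)
```

Storing ln A instead of A mirrors the parameter set θ = {ln A_l, E_l} used later for inference, and it keeps each factor positive by construction.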

Dataset of Experimental Reaction Rate Constants

|Training set| = 320, |Testing set| = 56

Hairpin closing and opening [Bonnet et al., 1998], [Bonnet, 2000], [Kim et al., 2006]

Bubble closing [Altan-Bonnet et al., 2003]

Helix association and dissociation [Morrison and Stols, 1993], [Reynaldo et al., 2000]

Toehold-mediated 3-way strand displacement [Reynaldo et al., 2000], [Zhang and Winfree, 2009], [Machinek et al., 2014]

Toehold-mediated 4-way strand exchange [Dabby, 2013]

Example: Hairpin Closing and Opening

[Figure: rate constants k−, k+ (s^-1, log scale) against 1000/T (K^-1) for hairpin closing (solid) and opening (open) [Bonnet et al., 1998]. The legend shows the hairpin loop length (T12, T16, T21, T30).]

Example: Toehold-mediated 3-way Strand Displacement With Mismatches

[Figure: rate constant K (M^-1 s^-1, log scale) against mismatch position (nt) for toehold-mediated 3-way strand displacement with mismatches [Machinek et al., 2014]. The legend shows the length of the toehold domain (10 nt, 7 nt, 6 nt).]

Framework Overview

[Figure: overview diagram of the framework.]

Predict Reaction Rate Constants

Instead of stochastic simulations, we estimate mean first passage times (MFPTs) with exact solvers [Suhov and Kelbert, 2008].

This lets us estimate the MFPT of slow reactions efficiently.

We use a reduced state space approach to enable sparse matrix computations.

[Figure: a secondary structure from the reduced state space. Red base pairs can break; blue bases can form.]
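An exact MFPT computation reduces to solving a linear system: for every non-final state i, the MFPT t_i satisfies (sum of rates leaving i) * t_i - sum over non-final j of k_ij * t_j = 1. The sketch below illustrates this on a toy chain. It is an assumption-laden illustration: it uses dense Gaussian elimination in pure Python for self-containment, whereas the talk refers to sparse matrix computations, and the function names are hypothetical. The conversion from MFPT to a rate constant (1/MFPT for a unimolecular reaction, 1/(MFPT * u) for a bimolecular one) is a common approximation assumed here, not a formula stated on the slides.

```python
def mean_first_passage_times(rates, final_states):
    """Solve for the MFPT from every non-final state to the set of final states.

    rates: dict mapping (i, j) to the transition rate k_ij.
    final_states: set of absorbing target states.
    Returns a dict mapping each non-final state to its MFPT.
    """
    states = sorted({s for pair in rates for s in pair})
    transient = [s for s in states if s not in final_states]
    index = {s: n for n, s in enumerate(transient)}
    n = len(transient)

    # Augmented matrix [A | b] with A = -Q restricted to transient states, b = 1.
    a = [[0.0] * n + [1.0] for _ in range(n)]
    for (i, j), k in rates.items():
        if i in index:
            a[index[i]][index[i]] += k          # total exit rate on the diagonal
            if j in index:
                a[index[i]][index[j]] -= k      # coupling to a transient state

    # Gaussian elimination with partial pivoting (assumes a non-singular system).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]

    # Back substitution.
    t = [0.0] * n
    for r in range(n - 1, -1, -1):
        residual = a[r][n] - sum(a[r][c] * t[c] for c in range(r + 1, n))
        t[r] = residual / a[r][r]
    return {s: t[index[s]] for s in transient}


def rate_constant_from_mfpt(mfpt, concentration=None):
    """Assumed conversion: 1/MFPT (unimolecular) or 1/(MFPT * u) (bimolecular)."""
    if concentration is None:
        return 1.0 / mfpt
    return 1.0 / (mfpt * concentration)


# Toy three-state chain 0 <-> 1 -> 2 with hypothetical rates; state 2 is final.
example_rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0}
example_mfpt = mean_first_passage_times(example_rates, final_states={2})
```

For large state spaces the same system would be assembled as a sparse matrix and handed to a sparse solver; the dense version here only shows the structure of the equations.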

Evaluate Parameter Sets

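Let $\theta$ be the set of parameters in a kinetic model.

For the Metropolis model, $\theta = \{\ln k_{uni}, \ln k_{bi}\}$.

For the Arrhenius model, $\theta = \{\ln A_l, E_l \mid \forall l \in C\} \cup \{\alpha\}$.

Let $k_r$ and $\hat{k}_r$ be the experimental and predicted reaction rate constants of reaction $r$, respectively.

The prediction error, $\epsilon_r$, is the difference between $\log_{10} k_r$ and $\log_{10} \hat{k}_r$. We assume $\epsilon_r \sim N(0, \sigma^2)$.

We also use priors on the parameters; for example, the $\frac{\lambda}{2}\|\theta\|_2^2$ term below is the log of a Gaussian prior on $\theta$.

Thus, the log of the posterior distribution on the training set $D_{train}$, where $n$ is the number of reactions in $D_{train}$, is:

$$\log P(\theta, \sigma \mid D_{train}) \approx -\frac{1}{2\sigma^2} \sum_{r \in D_{train}} \left(\log_{10} k_r - \log_{10} \hat{k}_r\right)^2 - (n + 1) \log \sigma - \frac{\lambda}{2} \|\theta\|_2^2$$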
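The log posterior can be written directly as a small function. This is a sketch under assumptions: the function name and argument layout are hypothetical, n is taken to be the number of training reactions, and the predicted rate constants are passed in as plain numbers, whereas in the framework they would come from the MFPT computation for a given θ.

```python
import math


def log_posterior(theta, sigma, k_exp, k_pred, lam):
    """Unnormalised log posterior of (theta, sigma) on a training set.

    theta: list of model parameters.
    sigma: standard deviation of the log10 prediction error.
    k_exp, k_pred: experimental and predicted rate constants, same order.
    lam: weight of the L2 penalty on theta.
    """
    n = len(k_exp)
    squared_error = sum(
        (math.log10(k) - math.log10(k_hat)) ** 2
        for k, k_hat in zip(k_exp, k_pred)
    )
    return (
        -squared_error / (2.0 * sigma ** 2)
        - (n + 1) * math.log(sigma)
        - 0.5 * lam * sum(p * p for p in theta)
    )


# Hypothetical numbers, for illustration only.
example_value = log_posterior(
    theta=[1.0, 2.0],
    sigma=1.0,
    k_exp=[100.0, 1000.0],
    k_pred=[10.0, 1000.0],
    lam=0.02,
)
```

Working with log10 of the rate constants means that a prediction off by a factor of ten contributes the same error whether the reaction is fast or slow.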

Sample Parameter Sets

We draw samples from the posterior distribution of the parameters.

We approximate the expected value of a reaction rate constant by averaging the predictions of all samples.

We use the emcee software package [Foreman-Mackey et al., 2013], a Markov chain Monte Carlo (MCMC) ensemble sampler.
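The averaging step can be sketched as follows. This example does not call emcee; `samples` stands in for the parameter sets an MCMC sampler would return, and both `ensemble_prediction` and the toy `predict` function are hypothetical names introduced for illustration.

```python
def ensemble_prediction(samples, predict):
    """Average the predicted rate constant over sampled parameter sets.

    samples: list of parameter sets drawn from the posterior.
    predict: function mapping one parameter set to a predicted rate constant.
    """
    predictions = [predict(theta) for theta in samples]
    return sum(predictions) / len(predictions)


def toy_predict(theta):
    """Stand-in predictor: treats theta[0] as log10 of the rate constant."""
    return 10.0 ** theta[0]


# Three made-up samples; a real run would use many posterior draws.
toy_samples = [[1.0], [2.0], [3.0]]
toy_average = ensemble_prediction(toy_samples, toy_predict)
```

In the framework described here, `predict` would be the MFPT-based rate constant computation evaluated at each sampled parameter set.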

Results

"Initial" denotes predictions made with an initial parameter set.

"Ensemble" denotes predictions made with the MCMC ensemble approach.

Hairpin Closing and Opening

[Figure: rate constants k−, k+ (s^-1, log scale) against 1000/T (K^-1) for hairpin closing (solid) and opening (open) [Bonnet et al., 1998]. The legend shows the hairpin loop length (T12, T16, T21, T30).]

Metropolis Model Fitting for Hairpin Closing and Opening

[Figure: two panels of k−, k+ (s^-1, log scale) against 1000/T (K^-1) for hairpin loop lengths T12, T16, T21, and T30. The second panel is titled "Metropolis with MCMC Ensemble".]

Arrhenius Model Fitting for Hairpin Closing and Opening

[Figure: two panels of k−, k+ (s^-1, log scale) against 1000/T (K^-1) for hairpin loop lengths T12, T16, T21, and T30. The second panel is titled "Arrhenius with MCMC Ensemble".]

Example: Toehold-mediated 3-way Strand Displacement With Mismatches

[Figure: rate constant K (M^-1 s^-1, log scale) for toehold-mediated 3-way strand displacement with mismatches [Machinek et al., 2014]. The legend shows the length of the toehold domain (10 nt, 7 nt, 6 nt).]

Model Predictions for Toehold-mediated 3-way Strand Displacement With Mismatches

[Figure: three panels of rate constant K (M^-1 s^-1, log scale) for toehold lengths of 10 nt, 7 nt, and 6 nt. The second panel is titled "Metropolis with MCMC Ensemble" and the third "Arrhenius with MCMC Ensemble".]

Summary

We introduce an Arrhenius kinetic model.

We train kinetic models. To do so, we collect a dataset of experimentally determined reaction rate constants and introduce a computational framework for predicting reaction rate constants.

Our Arrhenius model performs better than an existing model, the Metropolis model.

Thank You!
