Inferring Parameters for an Elementary Step Model of DNA …schmidtm/Documents/2017_DNA_arrhenius.pdf · The Arrhenius Kinetic Model. We introduce a new model that has locally context-dependent

Oct 22, 2020

  • Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates

    Sedigheh Zolaktaf¹, Frits Dannenberg², Xander Rudelis², Anne Condon¹, Joseph M. Schaeffer³, Mark Schmidt¹, Chris Thachuk², and Erik Winfree²

    ¹University of British Columbia, Vancouver, BC, Canada

    ²California Institute of Technology, Pasadena, CA, USA

    ³Autodesk Research, San Francisco, CA, USA

  • Nucleic Acid Kinetic Simulators

    Kinetic simulators aim to predict the kinetics of interacting nucleic acid strands, e.g., reaction rate constants.

    Useful for biological and biotechnological applications.

    [Yurke et al., 2000]

    Predicting kinetics is difficult (dependent on sequence, temperature, ...). Accurate models of nucleic acid kinetics are required.

  • Contributions

    We introduce an Arrhenius kinetic model.

    We train kinetic models. We collect a dataset of experimentally determined reaction rate constants. We introduce a computational framework for predicting reaction rate constants.

    Our Arrhenius model performs better than an existing model.

  • Modelling Kinetics of Interacting Nucleic Acid Strands

    Kinetics of interacting strands are modelled as continuous-time Markov chains (CTMCs) [Schaeffer et al., 2015].

    The state space represents non-pseudoknotted secondary structures.

    [Figure: state i and state j, connected by transitions with rates $k_{ij}$ and $k_{ji}$.]

    The transition rates are determined by a kinetic model and obey detailed balance:

    $\frac{k_{ij}}{k_{ji}} = e^{-\frac{\Delta G^0(j) - \Delta G^0(i)}{RT}}$

    $\Delta G^0(i)$: free energy of state i; $R$: gas constant; $T$: temperature.

    Estimate reaction rate constants from mean first passage times (MFPTs). The MFPT of a CTMC is the average time it takes to reach one of a set of final states from an initial state.
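
As a quick check on the detailed-balance relation, the following minimal sketch derives the reverse rate k_ji from a forward rate k_ij and the free energies of the two states. The function name, the example numbers, and the choice of kcal/mol with R = 0.0019872 kcal/(mol·K) are illustrative assumptions and do not come from the slides.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); the unit choice is an assumption


def reverse_rate(k_ij, dg_i, dg_j, temperature_k):
    """Return k_ji such that k_ij / k_ji = exp(-(dG(j) - dG(i)) / (R T))."""
    return k_ij * math.exp((dg_j - dg_i) / (R_KCAL * temperature_k))


# Hypothetical numbers: state j is 1.5 kcal/mol more stable than state i.
k_ij = 1.0e6
k_ji = reverse_rate(k_ij, dg_i=-2.0, dg_j=-3.5, temperature_k=298.15)
ratio = k_ij / k_ji
expected = math.exp(-(-3.5 - (-2.0)) / (R_KCAL * 298.15))
```

Because state j is lower in free energy here, the step away from it (k_ji) comes out slower than the step towards it (k_ij).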

  • The Metropolis Kinetic Model

    One kinetic model is the Metropolis model [Schaeffer et al., 2015].

    For states i and j, if $\Delta G^0(i) > \Delta G^0(j)$, the unimolecular transition rates are:

    [Figure: unimolecular transitions between state i and state j.]

    $k_{uni} > 0$: unimolecular rate constant

    Bimolecular transitions are given by:

    $k_{ij} = k_{bi} u$

    [Figure: bimolecular transition between state i and state j, with rates $k_{ij}$ and $k_{ji}$.]

    $k_{bi} > 0$: bimolecular rate constant
    $u$: initial concentration of the reactants

    The model predictions are off by several orders of magnitude when we train it with our framework.
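
The unimolecular rate expressions are not legible in this transcript. The sketch below therefore uses the textbook Metropolis form, in which downhill moves proceed at k_uni and uphill moves are slowed by a Boltzmann factor. That form is an assumption about the missing expressions, although it does satisfy detailed balance. The bimolecular join rate k_bi · u is taken from the slide. Function names, units, and numbers are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); the unit choice is an assumption


def metropolis_unimolecular(k_uni, dg_from, dg_to, temperature_k):
    """Textbook Metropolis rate for a move between two states (assumed form).

    Downhill or neutral moves proceed at k_uni; uphill moves are slowed by
    the Boltzmann factor of the free-energy increase.
    """
    delta = dg_to - dg_from
    if delta <= 0.0:
        return k_uni
    return k_uni * math.exp(-delta / (R_KCAL * temperature_k))


def bimolecular_join(k_bi, u):
    """Rate of a bimolecular transition as given on the slide: k_ij = k_bi * u."""
    return k_bi * u


# Hypothetical values for illustration only.
T = 298.15
down = metropolis_unimolecular(1e7, dg_from=-1.0, dg_to=-2.2, temperature_k=T)
up = metropolis_unimolecular(1e7, dg_from=-2.2, dg_to=-1.0, temperature_k=T)
join = bimolecular_join(k_bi=1e6, u=1e-8)
```

The ratio `down / up` equals the Boltzmann factor of the free-energy difference, so this pair of rates is consistent with the detailed-balance condition stated earlier in the talk.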

  • The Arrhenius Kinetic Model

    We introduce a new model that has locally context-dependent Arrhenius rates. Transition rates depend on the pairing status of the bases immediately to the left and right of the base pair that is forming or breaking.

    [Figure: two example transitions between state i and state j with rates $k_{ij}$ and $k_{ji}$; one is labelled with the half contexts stack and stack+loop, the other with loop and end.]

    Our model differentiates between seven different half contexts:

    $C = \{\text{stack}, \text{loop}, \text{end}, \text{stack+loop}, \ldots\}$

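  • The Arrhenius Kinetic Model

    We introduce a new model that has locally context-dependent Arrhenius rates. Transition rates depend on activation energy.

    Based on the set of half contexts C, we redefine $k_{uni} : C \times C \to \mathbb{R}_{>0}$:

    $k_{uni}(\text{left}, \text{right}) = A_{\text{left}} e^{-E_{\text{left}}/RT} \, A_{\text{right}} e^{-E_{\text{right}}/RT}$

    $E_{\text{left}}, E_{\text{right}}$: activation energies; $A_{\text{left}}, A_{\text{right}}$: rate constants

    $k_{uni}(\text{stack}, \text{stack+loop}) = A_{\text{stack}} e^{-E_{\text{stack}}/RT} \, A_{\text{stack+loop}} e^{-E_{\text{stack+loop}}/RT}$

    [Figure: transition between state i and state j with rates $k_{ij}$ and $k_{ji}$, labelled with the half contexts stack and stack+loop.]

    We redefine $k_{bi} : C \times C \to \mathbb{R}_{>0}$:

    $k_{bi}(\text{left}, \text{right}) = \alpha \, k_{uni}(\text{left}, \text{right})$

    $\alpha$: bimolecular scaling constant

    $k_{bi}(\text{end}, \text{loop}) = \alpha \, A_{\text{end}} e^{-E_{\text{end}}/RT} \, A_{\text{loop}} e^{-E_{\text{loop}}/RT}$

    [Figure: transition between state i and state j with rates $k_{ij}$ and $k_{ji}$, labelled with the half contexts end and loop.]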
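
The two rate definitions above can be sketched directly in code. The parameter table below stores (ln A, E) per half context, mirroring how the parameters are later listed. All numeric values, the units (kcal/mol), the function names, and the restriction to three half contexts are made-up assumptions for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); the unit choice is an assumption

# Hypothetical (ln A, E) pairs for three of the half contexts; values are made up.
PARAMS = {
    "stack": (6.0, 1.2),
    "loop": (5.0, 0.8),
    "end": (5.5, 1.0),
}


def k_uni(left, right, temperature_k, params=PARAMS):
    """k_uni(left, right) = A_left e^{-E_left/RT} * A_right e^{-E_right/RT}."""
    rt = R_KCAL * temperature_k
    ln_a_l, e_l = params[left]
    ln_a_r, e_r = params[right]
    return math.exp(ln_a_l - e_l / rt) * math.exp(ln_a_r - e_r / rt)


def k_bi(left, right, temperature_k, alpha, params=PARAMS):
    """k_bi(left, right) = alpha * k_uni(left, right)."""
    return alpha * k_uni(left, right, temperature_k, params)


T = 298.15
rate_uni = k_uni("stack", "loop", T)
rate_bi = k_bi("end", "loop", T, alpha=0.05)
```

Because each half context contributes its own Arrhenius factor, the rate is a product of a left term and a right term, and swapping left and right leaves it unchanged.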

  • Dataset of Experimental Reaction Rate Constants

    |Training set| = 320, |Testing set| = 56

    Hairpin closing and opening [Bonnet et al., 1998], [Bonnet, 2000], [Kim et al., 2006]

    Bubble closing [Altan-Bonnet et al., 2003]

    Helix association and dissociation [Morrison and Stols, 1993], [Reynaldo et al., 2000]

    Toehold-mediated 3-way strand displacement [Reynaldo et al., 2000], [Zhang and Winfree, 2009], [Machinek et al., 2014]

    Toehold-mediated 4-way strand exchange [Dabby, 2013]

  • Example: Hairpin Closing and Opening

    [Figure: rate constants k−, k+ (s⁻¹), log scale from 10¹ to 10⁵, against 1000/T (K⁻¹) from 3.1 to 3.6; legend: T12, T16, T21, T30.]

    Hairpin closing (solid) and opening (open) [Bonnet et al., 1998]. The legend shows the hairpin loop length.

  • Example: Toehold-mediated 3-way Strand Displacement With Mismatches

    [Figure: rate constant K (M⁻¹ s⁻¹), log scale from 10² to 10⁸, against mismatch position (nt) from 2 to 16; legend: 10 nt, 7 nt, 6 nt.]

    Toehold-mediated 3-way strand displacement with mismatches [Machinek et al., 2014]. The legend shows the length of the toehold domain.

  • Framework Overview


  • Predict Reaction Rate Constants

    Instead of stochastic simulations, we estimate mean first passage times (MFPTs) with exact solvers [Suhov and Kelbert, 2008].

    Can estimate the MFPT of slow reactions efficiently.

    We use a reduced state space approach to enable sparse matrix computations.

    [Figure: example structure. Red base pairs can break. Blue bases can form.]
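
To illustrate what an exact MFPT computation involves, the sketch below solves the standard first-passage linear system for a tiny chain. For every non-final state s, the MFPT t_s satisfies sum_j k_sj (t_j − t_s) = −1, with t = 0 on final states. The slides mention sparse matrix computations; this example deliberately uses a small dense Gaussian elimination in plain Python instead, for brevity. The function name, the toy rates, and the 1/MFPT rate estimate for a first-order reaction are assumptions.

```python
def mfpt(rates, n_states, final_states, initial_state):
    """Mean first passage time from initial_state to any state in final_states.

    rates maps (i, j) to the transition rate k_ij. For every non-final state s
    the MFPT t_s satisfies  sum_j k_sj * (t_j - t_s) = -1,  with t_f = 0 for
    final states. The small dense system is solved by Gaussian elimination.
    """
    transient = [s for s in range(n_states) if s not in final_states]
    index = {s: row for row, s in enumerate(transient)}
    m = len(transient)
    # Augmented matrix [Q_TT | -1]: generator restricted to transient states.
    a = [[0.0] * (m + 1) for _ in range(m)]
    for row in a:
        row[m] = -1.0
    for (i, j), k in rates.items():
        if i in index:
            a[index[i]][index[i]] -= k
            if j in index:
                a[index[i]][index[j]] += k
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    t = [0.0] * m
    for row in range(m - 1, -1, -1):
        acc = a[row][m] - sum(a[row][c] * t[c] for c in range(row + 1, m))
        t[row] = acc / a[row][row]
    return t[index[initial_state]]


# Toy three-state chain 0 <-> 1 -> 2 with made-up rates; state 2 is final.
toy_rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0}
tau = mfpt(toy_rates, n_states=3, final_states={2}, initial_state=0)
rate_constant = 1.0 / tau  # assumed first-order estimate, not from the slides
```

Unlike a stochastic simulation, the cost of this solve does not grow with how long the reaction takes, which is why slow reactions remain tractable. For realistic state spaces a sparse solver would replace the dense elimination.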

  • Evaluate Parameter Sets

    Let $\theta$ be the set of parameters in a kinetic model.
    For the Metropolis model, $\theta = \{\ln k_{uni}, \ln k_{bi}\}$.
    For the Arrhenius model, $\theta = \{\ln A_l, E_l \mid \forall l \in C\} \cup \{\alpha\}$.

    Let $k_r$ and $\hat{k}_r$ be the experimental and predicted reaction rate constants of reaction r, respectively.

    The prediction error, $\epsilon_r$, is the difference between $\log_{10} k_r$ and $\log_{10} \hat{k}_r$, with $\epsilon_r \sim N(0, \sigma^2)$.

    We also use priors.

    Thus, the log of the posterior distribution on the training set $D_{train}$ is:

    $\log P(\theta, \sigma \mid D_{train}) \approx -\frac{1}{2\sigma^2} \sum_{r \in D_{train}} \left(\log_{10} k_r - \log_{10} \hat{k}_r\right)^2 - (n + 1) \log \sigma - \frac{\lambda}{2} \|\theta\|_2^2$
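
The log posterior above translates almost term by term into code. In this minimal sketch, n is assumed to be the number of training reactions and log σ is assumed to be the natural logarithm; neither is stated in the transcript. The function name, the guard for non-positive σ, and the example data are also illustrative.

```python
import math


def log_posterior(theta, sigma, k_exp, k_pred, lam):
    """Unnormalized log posterior, following the expression on the slide.

    theta: list of model parameters; sigma: noise standard deviation;
    k_exp, k_pred: experimental and predicted rate constants per reaction;
    lam: regularization weight (lambda). n is taken to be the number of
    training reactions, which is an assumption.
    """
    if sigma <= 0.0:
        return -math.inf  # sigma must be positive; guard added for this sketch
    n = len(k_exp)
    sq_err = sum(
        (math.log10(k) - math.log10(k_hat)) ** 2
        for k, k_hat in zip(k_exp, k_pred)
    )
    l2 = sum(p * p for p in theta)
    return -sq_err / (2.0 * sigma ** 2) - (n + 1) * math.log(sigma) - 0.5 * lam * l2


# Made-up data: two reactions, predictions off by one decade and zero decades.
value = log_posterior(
    theta=[1.0, 2.0], sigma=1.0, k_exp=[1e3, 1e5], k_pred=[1e4, 1e5], lam=0.1
)
```

Working in log10 of the rate constants means that a prediction off by a factor of ten is penalized equally whether the reaction is fast or slow.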

  • Sample Parameter Sets

    We draw samples from the posterior distribution of the parameters.

    We approximate the expected value of a reaction rate constant by averaging the predictions of all samples.

    We use the emcee software package [Foreman-Mackey et al., 2013], a Markov chain Monte Carlo (MCMC) ensemble sampler.
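
The averaging step can be sketched independently of the sampler. Below, a hard-coded list stands in for posterior samples (in the talk these come from emcee), and `predict` is a hypothetical stand-in for the framework's rate prediction. Both are assumptions made only to keep the example self-contained.

```python
def ensemble_prediction(samples, predict):
    """Average predict(theta) over a collection of parameter samples."""
    predictions = [predict(theta) for theta in samples]
    return sum(predictions) / len(predictions)


def toy_predict(theta):
    """Hypothetical predictor: rate constant is 10 ** theta[0]."""
    return 10.0 ** theta[0]


# Stand-in for posterior samples; a real run would take these from the sampler.
toy_samples = [[2.0], [3.0], [4.0]]
k_hat = ensemble_prediction(toy_samples, toy_predict)
```

Averaging the predictions, rather than predicting once from averaged parameters, is what approximates the expected rate constant under the posterior.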

  • Results

    Initial is an initial parameter set.

    Ensemble is the MCMC ensemble approach.

  • Metropolis Model Fitting for Hairpin Closing and Opening

    [Figure: two panels plotting k−, k+ (s⁻¹), log scale from 10¹ to 10⁵, against 1000/T (K⁻¹) from 3.1 to 3.6; legend: T12, T16, T21, T30; panel title: "Metropolis with MCMC Ensemble".]

  • Arrhenius Model Fitting for Hairpin Closing and Opening

    [Figure: two panels plotting k−, k+ (s⁻¹), log scale from 10¹ to 10⁵, against 1000/T (K⁻¹) from 3.1 to 3.6; legend: T12, T16, T21, T30; panel title: "Arrhenius with MCMC Ensemble".]

  • Model Predictions for Toehold-mediated 3-way Strand Displacement With Mismatches

    [Figure: three panels plotting K (M⁻¹ s⁻¹), log scale from 10² to 10⁸, against mismatch position (nt) from 2 to 16; legend: 10 nt, 7 nt, 6 nt; panel titles: "Metropolis with MCMC Ensemble" and "Arrhenius with MCMC Ensemble".]

  • Summary

    We introduce an Arrhenius kinetic model.

    We train kinetic models. We collect a dataset of experimentally determined reaction rate constants. We introduce a computational framework for predicting reaction rate constants.

    Our Arrhenius model performs better than an existing model.

  • Thank You!
