François Fages Les Houches, April 2007
Formal Verification of Dynamical Models and Application to Cell Cycle Control
François Fages, Sylvain Soliman
Constraint Programming Group, INRIA Rocquencourt
http://contraintes.inria.fr/
Main idea: to master the complexity of biological systems, investigate
• Programming Language Concepts
• Formal Methods of Circuit and Program Verification
• Automated Reasoning Tools
Prototype Implementation in the Biochemical Abstract Machine BIOCHAM modeling environment available at http://contraintes.inria.fr/BIOCHAM
Systems Biology
“Systems Biology aims at systems-level understanding [which]
requires a set of principles and methodologies that links the
behaviors of molecules to systems characteristics and functions.”
H. Kitano, ICSB 2000
• Analyze (post-)genomic data produced with high-throughput technologies (stored in databases like GO, KEGG, BioCyc, etc.);
• Integrate heterogeneous data about a specific problem;
• Understand and Predict behaviors or interactions in big networks of genes or proteins.
Systems Biology Markup Language (SBML) : exchange format for reaction models
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives:
1) Models for representing knowledge: the more concrete the better
2) Models for making predictions: the more abstract the better!
These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions.
To understand a system is not to know everything about it but to know abstraction levels that are sufficient for answering questions about it.
Language-based Approaches to Cell Systems Biology
Qualitative models: from diagrammatic notation to
• Boolean networks [Kauffman 69, Thomas 73]
• Petri Nets [Reddy 93, Chaouiya 05]
• Process algebra π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00]
• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
of concentrations and their derivatives at discrete time points
2. Stochastic Semantics
Associates to each molecule its number $|A_i|$ in its location
Compiles the rule set into a continuous time Markov chain over vector states $(|A_1|, \dots, |A_k|)$, where the transition rate $\tau_i$ for the reaction $e_i$ for $S_i \Rightarrow S'_i$ (giving a probability after normalization) is [Gillespie 76, Gibson 00]:
• $\tau_i = e_i$ for reactions of the form A => ...,
• $\tau_i = e_i / (V_i \times K)$ for reactions of the form A+B => ...,
• $\tau_i = 2 \times e_i / (V_i \times K)$ for reactions of the form A+A => ...,
where $V_i$ is the volume where the reaction occurs and $K$ is Avogadro's number.
Computes realizations as time series (t0, X0), (t1, X1), …, (tn, Xn), …
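The way one realization (t0, X0), (t1, X1), … can be produced may be sketched with Gillespie's direct method. This is a minimal illustration in Python, not BIOCHAM's implementation. The names `gillespie`, `reactions`, `E` and `V_TIMES_K` are hypothetical, and the propensity is assumed to be the converted rate constant multiplied by the number of reactant pairs, which is the usual Gillespie convention rather than something stated on the slide.

```python
import random

def gillespie(state, reactions, t_end, rng):
    """Return one realization [(t0, X0), (t1, X1), ...] up to time t_end.

    state: dict molecule -> count.
    reactions: list of (propensity, change) pairs, where propensity maps a
    state to a rate and change maps molecule -> integer delta.
    """
    state = dict(state)
    t = 0.0
    trace = [(t, dict(state))]
    while True:
        rates = [propensity(state) for propensity, _ in reactions]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Pick a reaction with probability rate / total (the normalization step).
        pick, acc, chosen = rng.random() * total, 0.0, None
        for rate, (_, change) in zip(rates, reactions):
            if rate <= 0.0:
                continue
            chosen = change
            acc += rate
            if pick < acc:
                break
        for molecule, delta in chosen.items():
            state[molecule] += delta
        trace.append((t, dict(state)))
    return trace

# Toy rule A + B => C with e = 1 and V * K = 1 (assumed values).
E, V_TIMES_K = 1.0, 1.0
reactions = [(lambda s: E / V_TIMES_K * s["A"] * s["B"],
              {"A": -1, "B": -1, "C": +1})]
trace = gillespie({"A": 5, "B": 3, "C": 0}, reactions,
                  t_end=1e9, rng=random.Random(0))
```

With this toy rule the simulation stops once B is exhausted, since the total rate then drops to zero.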
The differential semantics is an abstraction of the stochastic one [Gillespie 76]
3. Boolean Semantics
Associates to each molecule a Boolean denoting its presence/absence in its location
Compiles the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants:
A∧B → A∧B∧C∧D
A∧B → ¬A∧B∧C∧D
A∧B → A∧¬B∧C∧D
A∧B → ¬A∧¬B∧C∧D
Necessary to over-approximate the possible behaviors under:
1. the stochastic semantics: trivial abstraction ℕ → {zero, non-zero}
2. the differential semantics: harder to relate mathematically
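The translation of one reaction into its Boolean transitions can be sketched as follows. This is an illustrative Python fragment under the assumption that a Boolean state is the set of molecules currently present. The function name `boolean_successors` is hypothetical and is not part of BIOCHAM.

```python
from itertools import product

def boolean_successors(state, reactants, products):
    """Successor states of `state` under one reaction, Boolean semantics.

    state: frozenset of molecules that are present.
    Each reactant may or may not be completely consumed, so a reaction
    with n distinct reactants yields up to 2**n successor states.
    """
    if not set(reactants) <= state:
        return set()  # the reaction is not enabled
    successors = set()
    for consumed in product([False, True], repeat=len(reactants)):
        gone = {r for r, is_gone in zip(reactants, consumed) if is_gone}
        successors.add(frozenset((state - gone) | set(products)))
    return successors

# A+B=>C+D from the state where only A and B are present.
succ = boolean_successors(frozenset("AB"), ["A", "B"], ["C", "D"])
```

Applying every enabled reaction in this way, one at a time, gives the asynchronous transition relation.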
Transcription: DNA → pre-mRNA → mRNA → Protein
• Activation: transcription factors bind to the regulatory region of the gene
#E2 + E2F13-DP12 <=> #E2-E2F13-DP12
• Transcription: RNA polymerase copies the DNA from start to stop positions into a single-stranded pre-mature messenger RNA
_ =[#E2-E2F13-DP12]=> pRNAcycA
• (Alternative) splicing: non-coding regions of pRNA are removed, giving mature messenger mRNA
pRNAcycA => mRNAcycA
• Transport: mRNA moves from the nucleus to the cytoplasm
mRNAcycA => mRNAcycA::c
• Translation: mRNA binds to a ribosome and synthesizes its protein
mRNAcycA::c + Ribosome::c <=> mRNAcycA-Ribosome::c
_ =[mRNAcycA-Ribosome::c]=> cycA::c
Numerical Simulation with Default Mass Action Law
Boolean simulation (random) of transcription
Cell Differentiation by Delta-Notch Signaling
Xenopus embryonic skin [Ghosh, Tomlin 2001]
At the steady state, a cell has either the Delta phenotype or the Notch
Lateral Inhibition through Delta-Notch Signaling
Delta production is triggered by low Notch concentration in the same cell
if [N::c1]<0.5 then 1,[D::c1] for _<=>D::c1.
Notch production is triggered by high Delta levels in neighboring cells
if [D::c21]+[D::c23]+[D::c12]+[D::c32]>0.2 then
1,[N::c22] for _ <=> N::c22.
Overview of the Lecture
1. Rule-based Language for Modeling Biochemical Systems
   1. Syntax of molecules, compartments and reactions
   2. Semantics at three abstraction levels: boolean, differential, stochastic
   3. Examples of signal transduction, transcription, cell-cell interaction
2. Temporal Logic Language for Formalizing Biological Properties
   1. CTL for the boolean semantics
   2. Constraint LTL for the differential semantics
   3. PCTL for the stochastic semantics
3. Automated Reasoning Tools
   1. Inferring kinetic parameter values from Constraint-LTL specification
   2. Inferring reaction rules from CTL specification
   3. Type inference by abstract interpretation
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
Express properties in:
• Computation Tree Logic CTL for the boolean semantics
• Linear Time Logic with numerical constraints for the concentration semantics
• Probabilistic CTL with numerical constraints for the stochastic semantics
2.1 Computation Tree Logic CTL
Extension of propositional (or first-order) logic with operators for time and choices [Clarke et al. 99]
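[Figure: computation tree, with the time operators F, G, U along paths, the non-determinism operators E, A across choices, and the combinations EF, EU, AG]

Time \ Choice     E (exists)            A (always)
X (next time)     EX(f) ≡ ¬AX(¬f)       AX(f)
F (finally)       EF(f) ≡ ¬AG(¬f)       AF(f)
G (globally)      EG(f) ≡ ¬AF(¬f)       AG(f)
U (until)         E(f1 U f2)            A(f1 U f2)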
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P) == EF(P)
• Can the cell produce P, Q and not R? reachable(P ∧ Q ∧ ¬R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing by another set of states s2? EF(s2 ∧ EF s)
• Is it possible to produce P without Q? E(¬Q U P)
• Is (set of) state s2 a necessary checkpoint for reaching (set of) state s? checkpoint(s2,s) == ¬E(¬s2 U s)
• Is s2 always a checkpoint for s? AG(¬s → checkpoint(s2,s))
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s) == AG(s)
• Is s a steady state (with possibility of escaping)? steady(s) == EG(s)
• Can the cell reach a stable state s? EF(stable(s)) (not in LTL)
• Must the cell reach a stable state s? AF(stable(s))
• What are the stable states? Not expressible in CTL. Needs to combine CTL with search (e.g. constraint programming [Thieffry et al. 05])
Biological Properties formalized in CTL (3/3)
About oscillations:
• Can the system exhibit a cyclic behavior w.r.t. the presence of P? oscil(P) == EG(F ¬P ∧ F P)
This is a CTL* formula that can be approximated in CTL by
oscil(P) == EG((P ⇒ EF ¬P) ∧ (¬P ⇒ EF P))
(necessary but not sufficient condition for oscillation)
• Can the system loop between states s and s2? loop(s,s2) == EG((s ⇒ EF s2) ∧ (s2 ⇒ EF s))
Temporal Logic Querying of MAPK Signaling Pathway
MEK~{p1} is a checkpoint for the cascade, i.e. producing MAPK~{p1,p2}
biocham: checkpoint(MEK~{p1}, MAPK~{p1,p2}).
!E(!MEK~{p1} U MAPK~{p1,p2}) is True
Model Checking is an algorithm for computing, in a given finite Kripke structure K, the set of states satisfying a CTL formula φ:
{s ∈ S : s |= φ}.
Represent K as a (finite) graph and iteratively label the nodes with the subformulas of φ which are true in that node.
• Add φ to the states satisfying φ
• Add EF φ (EX φ) to the (immediate) predecessors of states labeled by φ
• Add E(φ1 U φ2) to the predecessor states of φ2 while they satisfy φ1
• Add EG φ to the states for which there exists a path leading to a non-trivial strongly connected component of the subgraph of states satisfying φ
Thm. CTL model checking is P-complete; the model checking algorithm runs in O(|K|*|φ|).
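The labeling steps can be sketched on an explicit graph. This is a small Python illustration, not the NuSMV or BIOCHAM code. The names `pre`, `sat_EX`, `sat_EU`, `sat_EG` and the four-state example are hypothetical. For EG the sketch uses a greatest-fixpoint iteration instead of the strongly connected component search named above; it computes the same set but does not achieve the linear bound.

```python
def pre(trans, target):
    """States having at least one successor in target."""
    return {s for s, succs in trans.items() if succs & target}

def sat_EX(trans, f):
    return pre(trans, f)

def sat_EU(trans, f1, f2):
    """States satisfying E(f1 U f2): backward search from f2 through f1."""
    result, frontier = set(f2), set(f2)
    while frontier:
        frontier = (pre(trans, frontier) & f1) - result
        result |= frontier
    return result

def sat_EG(trans, f):
    """States satisfying EG f, by repeatedly dropping states with no successor left in the set."""
    result = set(f)
    while True:
        keep = result & pre(trans, result)
        if keep == result:
            return result
        result = keep

# Toy Kripke structure (every state has a successor): 0->1, 0->3, 1->2, 2->1, 3->3.
trans = {0: {1, 3}, 1: {2}, 2: {1}, 3: {3}}
states = set(trans)
P, Q = {2}, {1}                       # states where P, resp. Q, holds

reachable_P = sat_EU(trans, states, P)              # EF P == E(true U P)
checkpoint_Q_P = states - sat_EU(trans, states - Q, P)  # not E(not Q U P)
always_not_P = sat_EG(trans, states - P)            # EG not P
```

Here `checkpoint_Q_P` contains state 0, reflecting that every path from 0 to a P-state passes through a Q-state.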
Symbolic Model-Checking
Still for finite Kripke structures, use boolean constraints to represent
1. sets of states as a boolean constraint c(V)
2. the transition relation as a boolean constraint r(V,V’)
Binary Decision Diagrams BDD [Bryant 85] provide canonical forms to Boolean formulas (decide Boolean equivalence, TAUT is co-NP)
biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}).
Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is true
biocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB).
Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is true
biocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB))))).
Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is false
biocham: why.
-- Loop starts here
cycB-cdk1~{p1,p2,p3} is present
cdk7 is present
cycH is present
cdk1 is present
Myt1 is present
cdc25C~{p1} is present
rule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}.
cycB-cdk1~{p2,p3} is present
cycB-cdk1~{p1,p2,p3} is absent
rule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}.
cycB-cdk1~{p2,p3} is absent
cycB-cdk1~{p1,p2,p3} is present
Mammalian Cell Cycle Control Benchmark
500 variables, 2^500 states, 800 rules.
BIOCHAM NuSMV model-checker time in sec. [Chabrier et al. TCS 04]
Initial state: G2

Query                            Formula                                        Time (s)
Compiling                                                                       29
Reachability G1                  EF CycE                                        2
Reachability G1                  EF CycD                                        1.9
Reachability G1                  EF PCNA-CycD                                   1.7
Checkpoint for mitosis complex   ¬E(¬Cdc25~{Nterm} U Cdk1~{Thr161}-CycB)        2.2
Oscillation                      EG((CycA ⇒ EF ¬CycA) ∧ (¬CycA ⇒ EF CycA))      31.8
2.2 LTL with Constraints for the Differential Semantics
• Constraints over concentrations and derivatives as FOL formulae over the reals:
  • [M] > 0.2
  • [M]+[P] > [Q]
  • d([M])/dt < 0
• Linear Time Logic LTL operators for time X, F, U, G:
  • F([M]>0.2)
  • FG([M]>0.2)
  • F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0))))
  • oscil(M,n) defined as at least n alternations of sign of the derivative
  • Period(A,75) = ∃t ∃v F(T = t & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0) & F(T = t + 75 & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)))…
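The oscil(M,n) property can be illustrated on a sampled derivative. This is a minimal Python sketch that assumes the derivative of [M] is available as a list of values at successive time points and that exact zeros are ignored when looking for sign changes. The function names are hypothetical.

```python
def sign_alternations(derivatives):
    """Number of sign changes in a sequence of derivative values (zeros skipped)."""
    signs = [1 if d > 0 else -1 for d in derivatives if d != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def oscil(derivatives, n):
    """True if the derivative changes sign at least n times."""
    return sign_alternations(derivatives) >= n

# Hypothetical samples of d([M])/dt along a trace.
dM = [1.0, 0.5, -0.2, -1.0, 0.3, 2.0, -0.1]
count = sign_alternations(dM)
```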
How to Evaluate a Constraint LTL Formula ?
• Consider the ODEs of the concentration semantics dX/dt = f(X)
• Numerical integration methods produce a discretization of time (adaptive step size Runge-Kutta or Rosenbrock method for stiff systems)
• The resulting trace is a linear Kripke structure over concentrations and their derivatives at discrete time points
• Evaluate the formula on that Kripke structure with a model checking alg.
Simulation-Based Constraint LTL Model Checking
Hypothesis 1: the initial state is completely known
Hypothesis 2: the formula can be checked over a finite period of time [0,T]
1. Run the numerical integration from 0 to T producing values at a finite sequence of time points
2. Iteratively label the time points with the sub-formulae of φ that are true:
Add φ to the time points where a FOL formula φ is true,
Add F φ (X φ) to the (immediate) previous time points labeled by φ
Add φ1 U φ2 to the predecessor time points of φ2 while they satisfy φ1
(Add G φ to the states satisfying φ until T)
Model checker and numerical integration methods implemented in Prolog
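The labeling procedure can be sketched over a finite trace. The BIOCHAM checker is written in Prolog; the fragment below is only a Python illustration with a hypothetical tuple encoding of formulae and a hand-written trace standing in for the output of numerical integration. It assumes that X φ is false at the last time point.

```python
def label(formula, trace):
    """Truth value of `formula` at every time point of a finite trace."""
    op, n = formula[0], len(trace)
    if op == "atom":                  # ("atom", predicate over one state)
        return [bool(formula[1](s)) for s in trace]
    if op == "not":
        return [not v for v in label(formula[1], trace)]
    if op == "and":
        a, b = label(formula[1], trace), label(formula[2], trace)
        return [x and y for x, y in zip(a, b)]
    if op == "X":                     # assumed false at the last time point
        return label(formula[1], trace)[1:] + [False]
    if op in ("F", "G"):
        a, out = label(formula[1], trace), [False] * n
        acc = (op == "G")             # G starts true, F starts false
        for i in reversed(range(n)):  # sweep backwards from time T
            acc = (acc and a[i]) if op == "G" else (acc or a[i])
            out[i] = acc
        return out
    if op == "U":
        a, b = label(formula[1], trace), label(formula[2], trace)
        out, nxt = [False] * n, False
        for i in reversed(range(n)):
            nxt = b[i] or (a[i] and nxt)
            out[i] = nxt
        return out
    raise ValueError("unknown operator: %r" % (op,))

def holds(formula, trace):
    return label(formula, trace)[0]

# Hypothetical values of [M] at six time points.
trace = [{"M": m} for m in (0.0, 0.1, 0.3, 0.5, 0.4, 0.25)]
above = ("atom", lambda s: s["M"] > 0.2)
low = ("atom", lambda s: s["M"] < 0.5)
high = ("atom", lambda s: s["M"] >= 0.5)
```

For instance, `holds(("F", ("G", above)), trace)` evaluates FG([M]>0.2) at the initial time point.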
2.3 PCTL Model Checker for the Stochastic Semantics
Compute the probability of realization of a TL formula (with constraints) by Monte Carlo method
Perform several stochastic simulations
Evaluate the probability of realization of the TL formula
Costly…
PRISM [Kwiatkowska et al. 04] : PCTL model checker based on BDDs or Monte Carlo method.
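The Monte Carlo estimate can be sketched as the fraction of stochastic simulations on which the formula holds. This is a minimal Python illustration, not PRISM or BIOCHAM code. The random-walk simulator and the property below are hypothetical stand-ins for a stochastic simulation and a temporal logic formula.

```python
import random

def estimate_probability(simulate, holds, runs, rng):
    """Fraction of `runs` independent simulations whose trace satisfies `holds`."""
    hits = sum(1 for _ in range(runs) if holds(simulate(rng)))
    return hits / runs

def toy_simulation(rng, steps=20):
    """Hypothetical molecule count doing a random walk that cannot go below zero."""
    count, trace = 0, [0]
    for _ in range(steps):
        count = max(0, count + rng.choice((-1, 1)))
        trace.append(count)
    return trace

def eventually_at_least_3(trace):
    """Plays the role of F(count >= 3) on one finite trace."""
    return any(c >= 3 for c in trace)

p = estimate_probability(toy_simulation, eventually_at_least_3,
                         1000, random.Random(1))
```

The cost noted above shows up directly: the precision of the estimate improves only slowly with the number of runs, and each run is a full simulation.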
Overview of the Lecture
1. Rule-based Language for Modeling Biochemical Systems
   1. Syntax of molecules, compartments and reactions
   2. Semantics at three abstraction levels: boolean, differential, stochastic
   3. Examples of signal transduction, transcription, cell-cell interaction
2. Temporal Logic Language for Formalizing Biological Properties
   1. CTL for the boolean semantics
   2. Constraint LTL for the differential semantics
   3. PCTL for the stochastic semantics
3. Automated Reasoning Tools
   1. Inferring kinetic parameter values from Constraint-LTL specification
   2. Inferring reaction rules from CTL specification
   3. Type inference by abstract interpretation
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(k7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for