François Fages MPRI Bio-info 2007
Formal Biology of the Cell
Modeling, Computing and Reasoning with Constraints
François Fages, Constraint Programming Group, INRIA Rocquencourt
http://contraintes.inria.fr/
Transpose concepts and tools from programming theory to systems biology
• Formal Methods of Program Verification to Systems Biology,
• Constraint Logic Programming and Constraint-based Model Checking
In this course:
• Learn bits of cell biology through computational models,
• Develop new formalisms, languages and algorithms coming from biological questions
Systems Biology
• Multidisciplinary field aiming at getting over the complexity walls to reason about biological processes at the system level.
• Conferences ICSB, CMSB, …; journal TCSB, …
• Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments)
• Bioinformatics: end of the 90's, from genomic sequences to post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …)
• Need for a strong effort on:
  - the formal representation of biological processes,
  - formal tools for modeling and reasoning about their global behavior.
Language Approach to Cell Systems Biology
Qualitative models: from diagrammatic notation to
• Boolean networks [Thomas 73]
Quantitative models: from differential equation systems to
• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
• Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
• Rules with continuous dynamics BIOCHAM-2 [Chabrier-Fages-Soliman 04]
The Biochemical Abstract Machine BIOCHAM
Software environment based on two formal languages:
1. Biocham Rule Language for Modeling Biochemical Systems
   1. Syntax of molecules, compartments and reactions
   2. Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Biocham Temporal Logic for Formalizing Biological Properties
   1. CTL for Boolean semantics
   2. Constraint LTL for concentration semantics, PCTL for stochastic semantics
Machine learning Rules and Parameters from Temporal Properties
   1. Learning reaction rules from CTL specification
   2. Learning kinetic parameter values from Constraint-LTL specification
Internship topics: http://contraintes.inria.fr
Overview of the Lectures
1. Formal molecules and reaction rules in BIOCHAM.
2. Formal biological properties in temporal logic. Symbolic model-checking.
3. Continuous dynamics. Kinetics and transport models.
4. Computational models of the cell cycle control.
5. Abstract interpretation and typing of biochemical networks.
6. Machine learning reaction rules from temporal properties.
7. Constraint-based model checking. Learning kinetic parameter values.
8. Constraint Logic Programming approach to protein structure prediction.
Stability and bindings determined by the number of weak bonds: 3D shape
• 20% proteins (50-10^4 amino acids)
• RNA (10^2-10^4 nucleotides AGCU)
• DNA (10^2-10^6 nucleotides AGCT)
Structure Levels of Proteins
1) Primary structure: word of n amino acid residues (20^n possibilities) linked with C-N bonds
   Example: MPRI = Methionine-Proline-Arginine-Isoleucine
2) Secondary structure: word of m helices, strands, random coils, … (3^m-10^m possibilities) stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding stabilized by hydrophobic interactions
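The primary-structure count can be checked on the MPRI example with a short Python sketch. The function names `spell_out` and `count_primary_structures` are illustrative only and do not come from the lecture; the table uses the standard one-letter amino acid codes.

```python
# Standard one-letter codes for the 20 amino acids.
AMINO_ACIDS = {
    "A": "Alanine", "R": "Arginine", "N": "Asparagine", "D": "Aspartic acid",
    "C": "Cysteine", "Q": "Glutamine", "E": "Glutamic acid", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "L": "Leucine", "K": "Lysine",
    "M": "Methionine", "F": "Phenylalanine", "P": "Proline", "S": "Serine",
    "T": "Threonine", "W": "Tryptophan", "Y": "Tyrosine", "V": "Valine",
}


def spell_out(word):
    """Expand a word of one-letter codes into full residue names."""
    return "-".join(AMINO_ACIDS[letter] for letter in word)


def count_primary_structures(n):
    """Number of distinct words of n residues over the 20 amino acids (20^n)."""
    return len(AMINO_ACIDS) ** n


print(spell_out("MPRI"))            # Methionine-Proline-Arginine-Isoleucine
print(count_primary_structures(4))  # 160000
```

Even for a 4-residue word there are 160,000 candidates, which illustrates why the space of primary structures grows out of reach for proteins of 50 residues or more.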
Formal proteins
Molecule                                     BIOCHAM syntax
Cyclin dependent kinase 1                    Cdk1
  (free, inactive)
Complex Cdk1-Cyclin B                        Cdk1-CycB
  (low activity)
Phosphorylated form at site threonine 161    Cdk1~{thr161}-CycB
  (high activity)
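To make the notation concrete, here is a minimal Python sketch that splits the three molecule names above into components and phosphorylation sites. It is not BIOCHAM's parser: `parse_molecule` is a hypothetical helper, and it assumes only the two constructs visible on this slide, `-` for complexation and `~{...}` for the phosphorylated sites of the component it follows.

```python
import re

# One component: a name, optionally followed by ~{site1,site2,...}.
COMPONENT = re.compile(r"(\w+)(?:~\{([^}]*)\})?")


def parse_molecule(text):
    """Return a list of (name, phosphorylation sites), one per bound component.

    Assumption: '-' separates components and never occurs inside a site name.
    """
    parts = []
    for component in text.split("-"):
        match = COMPONENT.fullmatch(component)
        if match is None:
            raise ValueError(f"cannot parse component: {component!r}")
        name, sites = match.group(1), match.group(2)
        site_list = [s.strip() for s in sites.split(",")] if sites else []
        parts.append((name, site_list))
    return parts


for molecule in ["Cdk1", "Cdk1-CycB", "Cdk1~{thr161}-CycB"]:
    print(molecule, "->", parse_molecule(molecule))
```

Representing a complex as a list of (name, sites) pairs keeps the three activity states of Cdk1 distinguishable by structure alone, which is the point of giving molecules a formal syntax.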
Deoxyribonucleic Acid DNA
1) Primary structure: word over 4 nucleotides: Adenine, Guanine, Cytosine, Thymine
2) Secondary structure: double helix of pairs A--T and C---G stabilized by hydrogen bonds
DNA: Genome Size
Species                 Genome size   Chromosomes   Coding DNA
E. coli (bacteria)      4 Mb          1 circular    100 %
S. cerevisiae (yeast)   12 Mb         16            70 %
Mouse, Human            3 Gb          20, 23        15 %
Onion                   15 Gb         8             1 %
Lungfish                140 Gb        -             0.7 %
Human genome: 3,200,000,000 pairs of nucleotides, single nucleotide polymorphism 1 / 2 kb
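The figures in the genome size table can be turned into absolute amounts of coding DNA with a few lines of Python. This is only an arithmetic sketch over the slide's rounded values; the names `GENOMES_MB`, `coding_dna_mb` and `expected_snps` are illustrative, and the "Human" entry uses the 3 Gb and 15 % given on the "Mouse, Human" row.

```python
# species: (genome size in Mb, coding DNA in percent), values from the table
GENOMES_MB = {
    "S. cerevisiae": (12, 70),
    "Human": (3_000, 15),
    "Onion": (15_000, 1),
    "Lungfish": (140_000, 0.7),
}


def coding_dna_mb(size_mb, coding_percent):
    """Amount of coding DNA in Mb for a genome of the given size."""
    return size_mb * coding_percent / 100


def expected_snps(genome_bp, bp_per_snp=2_000):
    """Number of single nucleotide polymorphisms at a rate of 1 per 2 kb."""
    return genome_bp // bp_per_snp


for species, (size_mb, percent) in GENOMES_MB.items():
    print(f"{species}: about {coding_dna_mb(size_mb, percent):.0f} Mb of coding DNA")

print(expected_snps(3_200_000_000))  # 1600000
```

With these rounded inputs, the lungfish carries roughly twice as much coding DNA as a human despite a genome almost fifty times larger, so genome size alone says little about coding content.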
DNA Replication
Separation of the two strands of the double helix and production of one complementary strand for each copy (from one or several starting points of replication)
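Replication as described here can be sketched in Python on strings over A, T, C, G. The sketch is a deliberate simplification: both strands are written position by position so that paired bases line up, strand orientation and replication origins are ignored, and `complementary_strand` and `replicate` are illustrative names.

```python
# Base pairing in the double helix: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand):
    """Build the strand that pairs with the given one, base by base."""
    return "".join(COMPLEMENT[base] for base in strand)


def replicate(double_helix):
    """Separate the two strands and complete each with a new complementary strand."""
    strand, partner = double_helix
    first_copy = (strand, complementary_strand(strand))
    second_copy = (complementary_strand(partner), partner)
    return first_copy, second_copy


helix = ("ATGC", "TACG")
copy1, copy2 = replicate(helix)
print(copy1, copy2)
```

Each copy keeps one strand of the original helix and gains one newly built strand, so a correctly paired helix yields two copies identical to itself.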
Syntax of Genes
Gene: part of DNA, unique                  #E2
Activation: binding of promotion factor    #E2-E2F13-DP12
Repression: binding of another molecule
Transcription: DNA gene → pRNA → mRNA → Protein
Genes: parts of DNA
1. Activation (Inhibition): transcription factors (inhibitors) bind to the regulatory region of the gene
   #E2 + E2F13-DP12 => #E2-E2F13-DP12
2. Transcription: RNA polymerase copies the DNA from start to stop positions into a single-stranded premature messenger pRNA
   _=[#E2-E2F13-DP12]=> pRNAcycA
3. (Alternative) splicing: non-coding regions of pRNA are removed, giving mature messenger mRNA
   pRNAcycA => mRNAcycA
4. Protein synthesis: mRNA moves to cytoplasm and binds to ribosome to assemble a protein
   mRNAcycA => mRNAcycA::cyt
   mRNAcycA::cyt + ribosome::cyt => cycA::cyt
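The five rules above can be explored with a small Python sketch that computes which molecules may eventually appear from an initial set. This is not BIOCHAM's model checker: it assumes a simplified reading in which molecules are never consumed once present, so the result over-approximates what is reachable, and the encoding of a rule as a (reactants, catalysts, products) triple and the name `reachable_molecules` are choices made for this illustration.

```python
# Each rule: (reactants, catalysts, products), transcribed from the slide.
RULES = [
    ({"#E2", "E2F13-DP12"}, set(), {"#E2-E2F13-DP12"}),           # activation
    (set(), {"#E2-E2F13-DP12"}, {"pRNAcycA"}),                    # transcription
    ({"pRNAcycA"}, set(), {"mRNAcycA"}),                          # splicing
    ({"mRNAcycA"}, set(), {"mRNAcycA::cyt"}),                     # transport
    ({"mRNAcycA::cyt", "ribosome::cyt"}, set(), {"cycA::cyt"}),   # synthesis
]


def reachable_molecules(initial):
    """Fire rules until no new molecule appears (molecules are never removed)."""
    present = set(initial)
    changed = True
    while changed:
        changed = False
        for reactants, catalysts, products in RULES:
            enabled = (reactants | catalysts) <= present
            if enabled and not products <= present:
                present |= products
                changed = True
    return present


with_factor = reachable_molecules({"#E2", "E2F13-DP12", "ribosome::cyt"})
without_factor = reachable_molecules({"#E2", "ribosome::cyt"})
print("cycA::cyt" in with_factor)     # True
print("cycA::cyt" in without_factor)  # False
```

Under this simplification, asking whether `cycA::cyt` is in the result plays the role of a reachability query of the kind CTL expresses: the protein can be produced when the transcription factor is present, and cannot when it is absent.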