Digital Cells Foothill College Nanotechnology Image by John Alsop
Jan 27, 2015
Outline
• Concept of a gene (extended)
• Gene Regulatory Networks (GRN)
• GRN and cells as an information system
• Creating molecular interaction maps
• The goal and process of digital cells
• Using e-cell / SBML to model a cell
• Bio-nano-info convergence
Central Dogma in Biology
DNA (sequence, expression)
RNA (sequence, structure)
Protein (sequence, structure)
Protein feedback
Transcription control
Transcription
Translation
Concept of a Gene
• Why do we separate proteins from DNA in our definition of genes?
• One is seen as “distinct” from the other
– As if they had separate lives
• Proteins are the tactical or “execution” side of a gene – the field of Proteomics
• Nucleotides are the strategic or “planning” side of a gene – the field of Genomics
Fundamental Interactions
There are three ‘semi-distinct’ layers of process and information space inside a cell – connected through molecular networks
[Figure labels: ion channels, receptors, ligands, transcription factors, cis sites, mRNA, translation + processing; layers: electrophysiology, intracellular signaling, genetic regulatory network; compartments: extracellular space, cytoplasm, nucleus]
The (Really) Big Picture
Proteins and Pathways
Cellular Operating System
• Genes are interchangeable parts, but must be ‘tuned’ for synchronization, collaboration, workflow, messaging, etc.
• They are ‘metabolic.dlls’ – part of a cellular operating system. They are the most basal autonomous code in the cellular OS.
• Protein services must also ‘boot’ with the OS, and regulate how the OS interacts with the metabolome and with other signaling proteins.
Genomic Decision Networks
Simplified version of the phage λ decision network that determines whether an infected E. coli cell follows the lytic or lysogenic pathway. Dashed arrows indicate the direction of transcription, and bold arrows indicate regulatory interactions between a gene product and a particular DNA region.
Oscillating Networks
• Need to think about oscillating reactions inside a cell
– Protein formation / lifetime
• Gene regulatory networks create inverters (digital inverter networks)
• Inverters create ‘joined’ oscillating reactions with a lag time
– Timing from transcription to translation is critical, as is the half-life of the protein
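The idea that a ring of inverters produces a lagged oscillation can be sketched numerically. The following is a minimal, protein-only model of three repressors arranged in a ring, integrated with simple Euler steps. It is an illustration under stated assumptions, not a model taken from these slides: the production strength `beta`, the Hill coefficient `hill_n`, the starting levels, and the function names are all hypothetical, and time is measured in units of the protein lifetime (degradation rate 1).

```python
def simulate_ring_oscillator(beta=10.0, hill_n=3, dt=0.01, steps=10000):
    """Euler-integrate three repressors in a ring; return protein 0 over time.

    Assumed model (illustrative only): dx_i/dt = beta / (1 + x_prev**hill_n) - x_i,
    where x_prev is the protein that represses gene i.
    """
    # Unequal starting levels, so the ring does not sit on the symmetric steady state.
    x = [1.0, 1.5, 2.0]
    trace = []
    for _ in range(steps):
        # Gene i is repressed by protein i-1 (index -1 wraps around to the last protein).
        rates = [beta / (1.0 + x[i - 1] ** hill_n) - x[i] for i in range(3)]
        x = [x[i] + dt * rates[i] for i in range(3)]
        trace.append(x[0])
    return trace


def count_peaks(values):
    """Count strict local maxima in a list of numbers."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


trace = simulate_ring_oscillator()
late = trace[len(trace) // 2:]      # discard the initial transient
peaks = count_peaks(late)           # repeated peaks indicate sustained oscillation
swing = max(late) - min(late)       # peak-to-trough size of the oscillation
print(f"peaks in second half: {peaks}, swing: {swing:.2f}")
```

In this sketch the protein lifetime sets the time scale: because degradation is fixed at rate 1, every period is expressed in multiples of that lifetime, which is one way to see why half-life matters for timing.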
Strategy of Genes
• When?
• Where?
• How much?
• Who with?
• Gene circuits
– Regulatory / inhibition
– Promoters
– Co-expression
Mechanics of Transcription
Genes rely on several molecular signals and processes to manifest a solution, which is part of a larger decision network
Genes are just Solutions
• Successful molecular solutions involving aminoacyls required templates to execute
– When, orchestrated (how), and how much
– Executed in time, space, and abundance
• Genes today are complex solutions
– Most “genes” code for complex proteins
• Entire genomes orchestrate a symphony
– Organisms are autonomous collectives
Genetic Algorithms
Self-Assembled Algorithms
--------------------------- 1010110001011010ATGCCAGTACTGGTACGGTCATGACC0101001110100101---------------------------
Information vs. Processing
In a computer, data bits and processing bits are made from the same material: 0 or 1. In biology, information and processing are likewise built from one alphabet: A, T, C, G, or U.
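The parallel between the two alphabets can be made concrete: four bases carry two bits each. The sketch below uses a hypothetical 2-bit code (any one-to-one assignment of bases to bit pairs would work equally well); the mapping and function names are assumptions for illustration, not something defined in these slides.

```python
# Hypothetical 2-bit code for DNA bases; the particular assignment is arbitrary.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def dna_to_bits(seq: str) -> str:
    """Encode a DNA string as a string of 0s and 1s, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in seq.upper())


def bits_to_dna(bits: str) -> str:
    """Decode a bit string produced by dna_to_bits back into DNA."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


encoded = dna_to_bits("ATGC")
print(encoded)               # 00111001
print(bits_to_dna(encoded))  # ATGC
```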
Basic GRN Circuits
Gross anatomy of a minimal gene regulatory network (GRN) embedded in a regulatory network. A regulatory network can be viewed as a cellular input-output device. http://doegenomestolife.org/
Gene regulatory networks ‘interface’ with cellular processes
Goal of Digital Cells
• Simulate a Gene Regulatory Network
– Goal of e-cell, CellML, and SBML projects
• Test microarray data for biological model
– Run expression data through GRN functions
• Create biological cells with new functions
– Splice in promoters to control expression
– Create oscillating networks using operons
Digital Cells
• Bio-logic gates
• Inverters, oscillators
• Creating genomic circuitry
• Promoters, operons and genes
• Multi-genic oscillating solutions
Digital Cells
http://www.ee.princeton.edu/people/Weiss.php
Digital Cell Circuit (1)
INVERSE LOGIC. A digital inverter that consists of a gene encoding the instructions for protein B and containing a region (P) to which protein A binds. When A is absent (left), a situation representing the input bit 0, the gene is active and B is formed, corresponding to an output bit 1. When A is produced (right), making the input bit 1, it binds to P and blocks the action of the gene, preventing B from being formed and making the output bit 0. Weiss http://www.ee.princeton.edu/people/Weiss.php
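Read as Boolean logic, the inverter described above is a NOT gate. A minimal sketch follows; the function name and the use of `True`/`False` for "protein present/absent" are illustrative assumptions, and real gene expression is graded rather than strictly binary.

```python
def inverter(a_present: bool) -> bool:
    """Return True if protein B is produced, given whether protein A is present."""
    gene_blocked = a_present   # A binds region P and blocks the gene for B
    return not gene_blocked    # B is formed only when the gene is not blocked


for a in (False, True):
    print(f"input A={int(a)} -> output B={int(inverter(a))}")
```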
Digital Cell Circuit (2)
In this biological AND gate, the input proteins X and Y bind to and deactivate different copies of the gene that encodes protein R. This protein, in turn, deactivates the gene for protein Z, the output protein. If X and Y are both present, making both input bits 1, then R is not built but Z is, making the output bit 1. In the absence of X or Y or both, at least one of the genes on the left actively builds R, which goes on to block the construction of Z, making the output bit 0. Weiss
http://www.ee.princeton.edu/people/Weiss.php
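The AND gate described above is built entirely from repression steps: X and Y each switch off one copy of the gene for R, and R switches off the gene for Z. The sketch below traces that wiring with Booleans. The function and variable names are hypothetical, and the all-or-nothing treatment of protein levels is a simplifying assumption.

```python
def gene_active(repressor_present: bool) -> bool:
    """A repressible gene is expressed only when its repressor is absent."""
    return not repressor_present


def and_gate(x: bool, y: bool) -> bool:
    """Return True if output protein Z is produced for inputs X and Y."""
    r_from_copy_1 = gene_active(x)        # X deactivates one copy of the R gene
    r_from_copy_2 = gene_active(y)        # Y deactivates the other copy
    r_present = r_from_copy_1 or r_from_copy_2
    return gene_active(r_present)         # R deactivates the gene for Z


truth_table = {
    (x, y): and_gate(x, y) for x in (False, True) for y in (False, True)
}
for (x, y), z in truth_table.items():
    print(f"X={int(x)} Y={int(y)} -> Z={int(z)}")
```

Z appears only when both inputs are present, which matches the AND behaviour given in the caption.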
Gene Regulatory Network
Goals of Network Modelling
Molecular interaction networks
• Representation
• Analysis
• Communication
Different Network Types
• Gene regulation networks (gene networks)
– Describing transcriptional relationships
• Biochemical networks
– Describing interactions between proteins, enzymes and other participants in cellular functions
– e.g. cell cycle regulation and signal transduction
• Metabolic networks
– Describing interactions of metabolites
Advantages of Graphical Representation
• Graphical representation of biochemical networks is two-dimensional
• It therefore offers greater flexibility in describing biochemical networks than verbal description
– e.g. imagine describing a street map in words
[Figure: signaling diagram with nodes Ras, Raf, MEK, ERK, PDK-1, RSK, CREB, and c-Myc; “P” marks phosphorylation]
Process Diagram
Diagram proposal by A. Funahashi & H. Kitano
Process Diagram
• Is essentially a state transition diagram, as used in engineering or software development
• The following states can be represented:
– phosphorylation
– acetylation
– ubiquitination
– allosteric change
• Increasing need to use these diagrams to extract gene regulatory relationships to overlay with gene expression microarray data
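The state-transition view of a process diagram can be sketched as a small lookup table from (current state, event) to next state. Everything in the table below is a hypothetical illustration: the state names, the event names, and which transitions are allowed are assumptions chosen to mirror the modifications listed above, not rules taken from the process diagram notation itself.

```python
# Hypothetical modification states of one protein and the events that change them.
TRANSITIONS = {
    ("unmodified", "phosphorylation"): "phosphorylated",
    ("phosphorylated", "dephosphorylation"): "unmodified",
    ("unmodified", "acetylation"): "acetylated",
    ("unmodified", "ubiquitination"): "ubiquitinated",
}


def apply_events(state: str, events: list[str]) -> str:
    """Walk the state machine, raising if an event is not allowed in a state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition for {event!r} from state {state!r}")
        state = TRANSITIONS[key]
    return state


final_state = apply_events(
    "unmodified", ["phosphorylation", "dephosphorylation", "acetylation"]
)
print(final_state)  # acetylated
```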
Notation of the Process Diagram
State transition – changes the state of modification rather than activation
Activation
Inhibition
Translocation of module
Dashed line indicates active state of a molecule
Specific state of molecular species
Gene Regulatory Networks
• Post-transcriptional interactions should be invisible
• Only the gene regulatory network shall be extracted
– Activation or inhibition (instead of state transition)
– & indicates an ‘AND’ relationship
Molecular Interaction Maps (M.Aladjem, K.Kohn)
• Features:
– MIMs depict biochemical components of bioregulatory networks in a standard graphical notation (like “wiring diagrams” in electronics)
– More detailed and explicit than commonly used graphical representations
– Unambiguous
– Ability to view all interactions a molecule can be involved in
– Depicts competing interactions as well
– Ready access to annotations
– Retrieval of further information from external resources
– Represents consequences of interactions (e.g. an enzyme modifies another enzyme)
• Allows tracing of pathways within the network
• Increases the utility of MIMs as aids to computer simulation
Molecular Interaction Maps (MIM)
• Characteristics:
– Each molecule is shown in only one location
• All interactions and modifications can be traced from one point
• Molecules can be located from an index of map coordinates
– In “Cell Cycle eMIMs” (interactive MIMs) molecules serve as links to additional sources of information (PubMed, Gene Cards, MedMiner)
Proteins A and B can bind to each other. The node represents the A:B complex.
Multimolecular complex: x is A:B; y is (A:B):C. Endlessly extendable.
Reactions:
Covalent modification of protein A. A can exist in a phosphorylated state (P).
Cleavage of a covalent bond: dephosphorylation of A by a phosphatase (Ph’tase).
Stoichiometric conversion of A to B.
Symbols / Conventions used in eMIMs
Reactions:
Transport of A from cytosol to nucleus. The dot represents A after transport to the nucleus.
Formation of homodimer. Dot on the right represents a copy of A. Dot on the line represents the homodimer A:A.
Contingencies:
Enzymatic stimulation of a reaction
Enzymatic stimulation of a reaction in trans
Stimulation of a process. Bar indicates necessity.
Inhibition
Transcriptional activation
Transcriptional inhibition
Molecular Interaction Map (eMIM)
KEGG
• KEGG – Kyoto Encyclopedia of Genes and Genomes
• From a SWISS-PROT entry find the EC number for COMT (EC: 2.1.1.6 - but this doesn’t link into KEGG)
• Search H. sapiens database using DBGET (KEGG)
• Catechol O-methyltransferase, membrane-bound form (EC 2.1.1.6) (MB-COMT)
• Metabolism; Amino Acid Metabolism; Tyrosine metabolism [PATH:hsa00350]
• In the pathway maps (see next slide) click on the EC number or the substrate image for details.
Pathway Diagram in KEGG
[Figure labels: Reliable Microarray Measurements; Predictive Models; Model Validation; Experiments; Hypothesis; Biology; Engineering]
Delaware Biotech Institute
Microarrays And Models
Pathway Kinetics
BioSPICE – Open Source
http://biospice.lbl.gov/
BioCyc
• BioCyc Knowledge Library
• The EcoCyc and MetaCyc databases are highly curated databases whose content is derived principally from the biomedical literature
• PathoLogic - Computationally-Derived BioCyc Databases
– The majority of databases in the BioCyc collection were created by a program called PathoLogic
E-Cell
• The E-Cell System is an object-oriented software suite for modeling, simulation, and analysis of large-scale complex systems such as biological cells. Version 3 allows many components, driven by multiple algorithms with different timescales, to coexist.
CellML
CellML.org: The CellML™ language is an XML-based markup language being developed by Physiome Sciences Inc. in Princeton, New Jersey, in conjunction with the Bioengineering Institute at the University of Auckland and affiliated research groups.
The purpose of CellML is to store and exchange computer-based biological models. CellML allows scientists to share models even if they are using different model-building software. It also enables them to reuse components from one model in another, thus accelerating model building.
CellML
<model name="bi_egf_pathway_1999"
       cmeta:id="bi_egf_pathway_1999"
       xmlns="http://www.cellml.org/cellml/1.0#"
       xmlns:cellml="http://www.cellml.org/cellml/1.0#"
       xmlns:cmeta="http://www.cellml.org/metadata/1.0#"
       xmlns:mathml="http://www.w3.org/1998/Math/MathML">
  <rdf:Description rdf:about="">
    <!-- The Human Readable Name metadata. -->
    <dc:title>Epidermal growth factor stimulation of mitogen-associated protein kinase and activation of Ras</dc:title>
SBML
• SBML is one effort toward a machine-readable representation of molecular interaction networks (“MIN”)
• SBML is an XML-based modelling language that represents biochemical networks
• It enables exchange of biochemical network models between software applications (e.g. CellDesigner)
http://sbml.org
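To give a feel for what an SBML file contains, the sketch below hand-builds a tiny model with Python's standard `xml.etree.ElementTree`: one compartment, two species, and one reaction converting A to B. This is an illustration only. The element layout follows the general shape of SBML Level 2, but the model id, species ids, and amounts are hypothetical, the output has not been validated against the SBML schema, and real projects would normally use a dedicated SBML library or an editor such as CellDesigner.

```python
import xml.etree.ElementTree as ET

# Namespace for SBML Level 2 Version 4 documents.
SBML_NS = "http://www.sbml.org/sbml/level2/version4"


def build_conversion_model() -> str:
    """Return a minimal SBML-style document for the conversion A -> B."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "4"})
    model = ET.SubElement(sbml, "model", {"id": "a_to_b"})  # hypothetical id

    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell"})

    species = ET.SubElement(model, "listOfSpecies")
    for species_id, amount in (("A", "10"), ("B", "0")):  # illustrative amounts
        ET.SubElement(
            species,
            "species",
            {"id": species_id, "compartment": "cell", "initialAmount": amount},
        )

    reactions = ET.SubElement(model, "listOfReactions")
    reaction = ET.SubElement(
        reactions, "reaction", {"id": "conversion", "reversible": "false"}
    )
    reactants = ET.SubElement(reaction, "listOfReactants")
    ET.SubElement(reactants, "speciesReference", {"species": "A"})
    products = ET.SubElement(reaction, "listOfProducts")
    ET.SubElement(products, "speciesReference", {"species": "B"})

    return ET.tostring(sbml, encoding="unicode")


sbml_text = build_conversion_model()
print(sbml_text)
```

Because the result is plain XML, any tool that reads the format can parse it back and recover the species and reactions, which is the exchange property the bullets above describe.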
Bio-Nano-Info
• Looking at bio through the eyes of nano
– Physical properties of small systems
• Looking at nano through the eyes of bio
– Self-assembly of nano-structures
• Interaction of information and molecules
– Molecular assemblies as information and operating systems - nano execution of IT
• The universe’s nanoscale properties affect the processing of three attributes
– Energy
– Mass
– Information
• Biology leverages these to produce a cellular operating system, metabolism, and complex self-assembled structures
Three Dimensions of Nano
Self Assembly
• Follows statistical thermodynamics
• Seen in molecular monolayers
• Building process for viral capsids
• Use nature to guide manufacturing
– Control and guide novel structures
Molecular Self Assembly
Figure 1: 3D diagram of a lipid bilayer membrane - water molecules not represented for clarity
http://www.shu.ac.uk/schools/research/mri/model/micelles/micelles.htm
Figure 2: Different lipid models. Top: multi-particle lipid molecule. Bottom: single-particle lipid molecule.
Viral Self-Assembly
http://www.virology.net/Big_Virology/BVunassignplant.html
Bio-Nano Convergence
Summary
• Cell as an information system
• Genome as a decision network
• Pathways and process diagrams
• Digital cells - in silico biology
• Bio-nano-info convergence
– Biology as an ‘instance’ of nanotechnology
– Nature as an information (processing) system
References
• http://www.ee.princeton.edu/people/Weiss.php
• http://www.dbi.udel.edu/
• http://biospice.lbl.gov/
• http://www.systems-biology.org/
• http://www.e-cell.org/
• http://sbml.org/
• http://biocyc.org/
• http://www.sbi.uni-rostock.de/teaching/research/
• http://www.ipt.arc.nasa.gov/