-
Understanding the genotypephenotype relationship is at the core
of the life sciences. For the latter half of the twentieth century,
the reductionist approaches of genetics, biochemistry and molecular
biology focused on the elucidation of biological components that
under-lie this fundamental relationship. These approaches have
provided detailed understanding of individual compo-nents, but they
do not address the systemic inter actions of biological and
environmental components that underlie phenotypes. Technological
advances have now enabled high-throughput methods to
comprehensively characterize biological components simultaneously.
The cost of such data generation has decreased exponentially and
the amount of data generated has become more abundant, which
enables biologists to view and study cells as systems of
interacting components.
To cope with the rapidly growing number of high-dimensional data
sets, sophisticated data analysis methods are needed. Diverse
approaches that range from stochastic kinetic models to statistical
Bayesian networks have been applied, and each of these approaches
has differ-ing rationales and advantages (TABLE1). One of these
approaches is constraint-based reconstruction and analysis that is
applied to genome-scale metabolic net-works. Reconstructed
genome-scale metabolic networks contain curated and systematized
information about the known small metabolites and metabolic
reactions of a cell type, which is based on its annotated genome
and on experimental literature1,2. Genome-scale metabolic
networks can be converted to a mathematically con-sistent
format, which is known as the stoichiometric matrix (BOX1). This
matrix is the central component of a constraint-based model (CBM),
which can be queried by an ever-growing set of modelling methods3
(BOX2). CBMs have been primarily built for metabolic networks,
including multicellular metabolic interactions48. CBMs have also
been built for signalling9,10, transcriptional regulation11 and
macromolecule synthesis12.
This Review illustrates how CBMs have recently provided the
foundation for formulating genome-scale mechanistic predictions of
metabolic physiology that are now being used in a prospective
manner to eluci-date new biological knowledge and understanding. We
begin with a brief description of the four-phase history of the
development of CBM applications. We then pre-sent studies that show
the recent progress of integrat-ing high-throughput data sets with
the mechanistic and functional context of CBMs to predict metabolic
pheno-types, and we emphasize the implemented workflows,
limitations of the approach and opportunities for further
development.
Foundational developmentsConstraint-based analysis has been
applied to bio-chemical reaction networks for more than 25years. To
put these developments into context, we exhaustively searched the
literature using Web of Knowledge to collect research articles that
use CBMs for interpreting
Department of Bioengineering, University of California, San
Diego, 9500 Gilman Dr, La Jolla, California 920930412,
USA.Correspondence to B.O.P.email:
[email protected]:10.1038/nrg3643Published online 16 January
2014
Constraint-based models predict metabolic and associated
cellular functionsAarash Bordbar, Jonathan M.Monk, Zachary A.King
and Bernhard O.Palsson
Abstract | The prediction of cellular function from a genotype
is a fundamental goal in biology. For metabolism, constraint-based
modelling methods systematize biochemical, genetic and genomic
knowledge into a mathematical framework that enables a mechanistic
description of metabolic physiology. The use of constraint-based
approaches has evolved over ~30years, and an increasing number of
studies have recently combined models with high-throughput data
sets for prospective experimentation. These studies have led to
validation of increasingly important and relevant biological
predictions. As reviewed here, these recent successes have tangible
implications in the fields of microbial evolution, interaction
networks, genetic engineering and drug discovery.
Nature Reviews Genetics | AOP, published online 16 January 2014;
doi:10.1038/nrg3643 R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 1
2014 Macmillan Publishers Limited. All rights reserved
mailto:[email protected]
-
Metabolite overflowsBiological phenomena whereby the rate of
substrate use by a cell for growth is lower than the rates of
uptake and conversion of the substrate, which results in production
of side metabolites (for example, acetate in Escherichia coli).
Metabolic fluxesThe rates of turnover or movement of metabolites
through a reaction or a pathway.
Objective functionsThe particular variables, or metabolic
reactions, that are being maximized or minimized for by the linear
programme. In flux-balance analysis, the objective function is
often a pseudoreaction for biomass generation that represents
cellular growth.
and predicting biological phenotypes. We collected 645 articles
that were published from 1986 to 15June 2013. The articles are
available with short descriptions of their contributions to the
research field at the literature on model-driven analysis website
hosted by the University of California, San Diego Systems Biology
Research group. An analysis of this literature shows that the
history of CBMs can be divided into fourphases.
Initial studies (19861998). CBMs were initially used to
determine theoretical pathway yields and metabolite overflows13,14.
Experimental metabolic fluxes and growth rates were shown to be
consistent with fluxes that were computed on the basis of
optimization of cellular objective functions, including minimal
production of reactive oxy-gen species (ROS) for hybridoma cells15
and maximal growth rate for laboratory strains of Escherichia coli
16. Concurrently, algorithms such as Elementary Flux Modes17 and
Extreme Pathways18 were developed19 to exhaustively calculate
metabolic pathways in CBMs for analysis of network topology20 and
for uses in metabolic engineering21. The quantitative match between
CBM predictions and measured cellular behaviour opened up the
possibility of predicting phenotypes from a biochemically
reconstructed network.
Building genome-scale networks (19992004). The ability to
sequence whole genomes22 made it possible to formulate CBMs at the
genome scale and allowed representation of the complete metabolic
gene content in the assessment of phenotypic functions23.
Importantly, metabolic reactions in a CBM could be directly linked
to the genotype of the target cell, which allowed predic-tion of
the consequences of gene knockouts24,25. These genome-scale models
facilitated the study of the global organization of cellular
behaviour, such as pathway structure26, adaptive evolution end
points27, metabolic fluxes28 and bacterial evolution29,30.
Integrating omic data (20052009). As the generation of omic data
became cheaper and as larger data sets appeared, researchers began
to incorporate these data sets into CBMs31 (FIG.1). Initially, the
metabolic network was used as a scaffold to interpret
transcriptional changes32,33 in a manner that is similar to pathway
enrichment analysis (FIG.1a). Subsequently, omic data were used
more directly by further constraining individual metabolic
reactions to increase the context specificity of CBMs34,35
(FIG.1b).
Maturing to predictive practice (2010present). These efforts
resulted in highly curated and validated
Table 1 | A comparison of modelling and analysis techniques for
high-throughput data
Method Model systems Parameterization Typical prediction
type
Advantages Disadvantages Refs
Stochastic kinetic modelling
Small-scale biological processes
Detailed kinetic parameters
Reaction fluxes, component concentrations and regulatory
states
MechanisticDynamicCaptures biological
stochasticity and biophysics
Computationally intensive
Difficult to parameterizeChallenging to model
multiple timescales
106
Deterministic kinetic modelling
Small-scale biological processes
Detailed kinetic parameters
Reaction fluxes, component concentrations and regulatory
states
MechanisticDynamic
Computationally intensive
Difficult to parameterize
107
Constraint-based modelling
Genome-scale metabolism
Network topology, and uptake and secretion rates
Metabolic flux states and gene essentiality
MechanisticLarge scaleNo kinetic
information is required
No inherent dynamic or regulatory predictions
No explicit representation of metabolic concentrations
3,104
Logical, Boolean or rule-based formalisms
Signalling networks and transcriptional regulatory networks
Rule-based interaction network
Global activity states and onoff states of genes
Can model dynamics and regulation
Biological systems are rarely discrete
108
Bayesian approaches
Gene regulatory networks and signalling networks
High-throughput data sets
Probability distribution score
Non-biasedCan include
disparate and even non-biological data
Takes previous associations into account
StatisticalIssues of over-fittingRequires comprehensive
training data
109, 110
Graph and interaction networks
Proteinprotein and genetic interaction networks
Interaction network that is based on biological data
Enriched clusters of genes and proteins
Incorporates prior biological data
Encompasses most cellular processes
Dynamics are not explicitly represented
111, 112
Pathway enrichment analysis
Metabolic and signalling networks
Pathway databases (for example, KEGG, Gene Ontology and
BioCyc)
Enriched pathways Simple and quickTakes prior
knowledge into account
Biased to human-defined pathways
Non-modelling approach
73
KEGG, Kyoto Encyclopedia of Genes and Genomes.
R E V I E W S
2 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
http://sbrg.ucsd.edu/cobra-predictionshttp://sbrg.ucsd.edu/cobra-predictions
-
Metabolic pathwaysIn the context of this Review, sets of
pathways that are calculated by metabolic network-based pathway
analysis tools such as Extreme Pathways and Elementary Flux
Modes.
Metabolic engineeringThe practice of improving cellular
production of target compounds of interest by modifying and
optimizing genetic, regulatory and environmental parameters of
cellular metabolism.
Genome-scale modelsThe formulation, using mathematical models,
of genome-scale metabolic network reconstructions. They are
synonymous with constraint-based models in the context of this
Review.
Pathway enrichment analysisA high-throughput data analysis
technique to understand more global changes in an experiment by
grouping individual measurements of biological components (for
example, genes and proteins) into a context that is based on
various pathway databases (for example, Kyoto Encyclopedia of Genes
and Genomes, BioCyc and Gene Ontology).
Metabolic flux analysisAn experimental approach to identify
metabolic fluxes using isotopically labelled metabolites and
computational software that reconciles experimental data with
network topology.
Flux distributionsSets of calculated flux values for all
reactions in a constraint-based model.
Pareto surfaceThe space that is formed when multiple objective
functions are modelled at once; it represents a set of optimal
solutions, in which increasing the value of one of the objectives
results in a trade-off with other objective values.
genome-scale models that are now enabling the research community
to obtain meaningful predictions of biologi-cal functions. This
Review is focused on this most recent phase in the field of CBM
development.
We first discuss the latest evaluations of the assump-tions of
constraint-based modelling. Second, we discuss the integration of
genome-scale data sets specifically, omic data and biomolecular
interaction data with CBMs. Third, we focus on how discre-pancies
between model predictions and experimental data allow targeted
experimentation that leads to bio-chemical discovery. Fourth,
translational applications of constraint-based modelling, including
metabolic engineering and drug target discovery, are discussed.
Finally, we focus on recent advances of integrating CBMs with other
modelling approaches to increase their predictivescope.
Refining objectivesThe first constraint-based method for
biological predic-tions was flux-balance analysis (FBA). Its
formulation is rooted in the hypothesis that a cell is striving to
achieve a metabolic objective (BOX2). Studies have shown that, by
optimizing the assumed cellular objectives of growth27 and energy
use36,37, one can predict metabolic fluxes in microorganisms. Other
studies have questioned the universality of the objective function
of biomass growth for predicting relevant metabolic fluxes3840.
Do cells maximize growth rate? To identify the objec-tive
function that best predicts experimental data on growing cells, one
study41 greatly expanded the ini-tial assessment of appropriate
objective functions for FBA by compiling 44 metabolic flux analysis
data sets of invivo flux distributions for E.coli, and the
researchers evaluated the ability of a reduced CBM to predict these
measurements using dozens of single and combined candidate cellular
objective functions (FIG.1c). The best representation for the
invivo fluxomic data sets was a Pareto surface that is defined by a
combination of three objectives: maximizing biomass generation,
maximiz-ing ATP generation and minimizing reaction fluxes across
the network; that is, the minimization is a proxy for the most
efficient use of the proteome42. Using flux variability analysis
(FVA) (BOX2), the authors found that there is some slack in
metabolic reaction fluxes when the cell is operating close to but
not on the Pareto surface. In fact, they also observed that the
invivo flux distributions were slightly sub-optimal. The authors
showed that this sub-optimality is most likely an evolutionary
adaptation that allows rapid adjustment to environmental
perturbations. In this study, meta-bolic flux analysis simulations
were limited to central carbon metabolism. Future studies are
therefore needed to determine whether the optimality principles
that have been derived in this study will hold for other metabolic
subsystems that are studied using different
Box 1 | Constraint-based modelling: motivation and
definition
Thefunctionalcapabilitiesofbiologicalsystemsareconstrainedbytheirgeneticsandenvironment,andbyphysico-chemicallaws.Forexample,mostnaturalenvironmentsarelimitedinnitrogenorphosphate.Inaddition,therateofphotosynthesisisafunctionoflatitudeastheincidentfluxofphotonschanges.Inthe1960s,DanielAtkinsonrealizedthatsolventcapacitywasalimitationinallcells,ascellstendtoconsistof70%waterand30%biomass100.In1973,PaulWeiszshowedthatmostintracellularprocessesoperateatratesthatareclosetothelimitsofdiffusion101.Theseandothermyriadconstraintsunderwhichcellsoperateandevolvehavebeensummarized102.Wecannowsystematicallyreconstructmetabolicandotherbiochemicalreactionnetworks(seethefigure).Metabolic
networksareanalogoustoflownetworks,inwhichmetabolites(shownascircles)flowthroughthenetworkinamannerthatissimilartoliquidsflowinginapipe.Theseflows,andthusthestateofanetwork,aresubjecttomyriadconstraints.Thenetworkcanbeconvertedintoamathematicalformatknownasthestoichiometricmatrixforcomputation.Ratherthanderivingasinglesolution,constraint-basedmodelshaveanassociatedsolution
space(shownasabox)inwhichallfeasiblephenotypicstatesexistgiventheimposedconstraints.Thisallowsonetosimultaneouslyaccountforthemanyprocessesthatactonandincells.Metaboliteflowisconstrainedby,amongotherthings,thenetworktopology(forexample,theconnectionof
metabolites)andasteady-stateassumption(forexample,theassumptionthatinternalmetabolitesmustbeproducedandconsumedinamass-balancedmanner).Itisalsoconstrainedbytheknownupperbounds(alsoknownascapacities;forexample,V
1,max)andlowerboundsofindividualreactionfluxes.Imposingsuchconstraintsshrinksthesolutionspaceto
amorebiologicallyrelevantregion.Thechallengesinconstraint-basedmodellinglieinidentifyingandimposingthenecessaryanddominantconstraintstodefineasolutionspace,aswellasinprobingthesolutionspaceinamannersuchthatphysiologicallyrelevantfluxesorphenotypesaredetermined.
Figureismodified,withpermission,fromREF.3(2012)MacmillanPublishersLtd.Allrightsreserved.Nature
Reviews | Genetics
Metabolic network Stoichiometric matrix Imposition of
constraints
V3,maxV3,max
V1,max V1,max
V2,max V2,max
Measuredflux rates
Steady-statemass balance
1 10 0 00 0 00
1 1 000 0 0 00
1 1 1 000 0 00
0 1 10 0 0 000
0 11 10 0 00 0
0 1 0 0 00001
10 0 0 000 01
Reactions
Met
abol
ites
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 3
2014 Macmillan Publishers Limited. All rights reserved
-
Central carbon metabolismThe metabolic pathways and reactions
that convert sugars into the metabolic precursors that are required
for growth. It is typically comprised of glycolysis, pentose
phosphate pathways and the tricarboxylic acid cycle.
Solution spaceThe range of all feasible values for variables in
a constraint-based model, which represents all potential metabolic
reaction flux distributions on the basis of the given
constraints.
labelled metabolites. Nonetheless, CBMs can use optimality
principles to predict the approximate growth state and the hedging
functions that keep the cells from fully reaching the predicted
optimal states, which are yet to be delineated. Such hedging
functions are expected to vary from strain to strain and from
organism to organism on the basis of their evolutionary
history.
Moving beyond the assumption of growth optimality. Although the
prediction of optimal growth rates has historically received much
attention, most of the recent studies that are highlighted in this
Review do not assume optimality of cellular growth. Researchers
have been increasingly adopting alternative unbiased approaches
such as Markov chain Monte Carlo (MCMC) sampling, omic data
integration and meta-bolic pathway analyses3 that are not subject
to assumptions of optimality.
Contextualizing omic dataThe constraint-based modelling
framework is ame-nable to simultaneous integration of a range of
omic data types31 (FIG.1). In particular, omic data have been used
both to constrain calculated flux distributions and as a comparison
and validation tool for model predictions. Such omic data
integration has enabled context-specific studies of the metabolism
of an organ-ism and, in the following cases, the studies of enzyme
promiscuity and pathogenesis.
Why are some enzymes specific and some promiscuous? It is
thought that ancestral enzymes were promiscuous and inefficient,
and that they have evolved to become catalytically efficient and
specific43. However, it is not well understood why such evolution
took place for some enzymes but not for others. To address this
question, one study44 classified proteins in the E.coli CBM45 into
two groups (that is, specialist and generalist enzymes) on the
Nature Reviews | Genetics
a Constraining reaction bounds b Computing flux
distributions
Flux-balance analysis
Flux variability analysis
MCMC sampling
Vobj
V2 V1
Geneknockout
Omic dataintegration
Uptakerates
Secretionrates
- 0
- 0
- 0
- 0
Imposed constraint
Unconstrained
Box 2 | Constraint-based modelling: introduction to methods for
analysis
Constraint-basedmodels(CBMs)havebeenwidelydeployed;seeREF.3foranextensivedescriptionofthedevelopedmethods.However,mostofthesetechniquesarebasedontwokeycomponents:theconstraintsonthebiologicalsystemandtheanalysismethodtopredictfluxes(seethefigure).
Constraining metabolic models.
Thetoymodel(seethefigure,parta)containsmetabolites(shownascircles)thatareconvertedbyreactions(shownasarrows).Eachreactionhasarangeofpotentialfluxvalues,whichcanbeconstrained(shownassliders).TheimpositionofconstraintsdefinestheassociatedsolutionspaceoftheCBM.Mostmethodsmodifythemetabolicreactionboundsformodelparameterization.Simpleconstraintsincludefixingcellularinputandoutputrangesonthebasisofuptakeandsecretionofmetabolites56,aswellascarryingoutgeneticknockoutsbysettingtheassociatedboundsofthereactionstozero24.MoreadvancedtechniquesincludemodifyingreactionboundsonthebasisofmRNAandproteinexpressiondata,eitherbysettingtheboundstozeroforreactionsthatcorrespondtoabsenttranscriptsandproteins34,35(seethefigure,parta)orbylinearlyadjustingtheboundsonthebasisoftranscriptandproteinabundances103.
Determining flux distributions.
Aftermodelparameterization,fluxesarecalculated.Asimplifiedsolutionspaceisdepictedwithtworeactions(V
1andV
2)andanobjectivefunction(V
obj)(seethefigure,partb).Thestandardapproachof
flux-balanceanalysis104eithermaximizesorminimizesthefluxofauser-definedreaction(thatis,theobjectivefunction)usinglinearprogramming(shownbythegreencircle).Anothercommonapproachisfluxvariabilityanalysis105,inwhichthemaximumandminimumfluxesthrougheachreactionareiterativelycomputedwhenthefluxoftheobjectivefunctionistypicallyconstrainedtoitsmaximumvalue(shownbyyellowcircles).Finally,MarkovchainMonteCarlo(MCMC)samplingcomputesmanycandidatefluxdistributions(shownasreddots)thatprovideaprobabilitydistributionforthefluxes.Thisapproachisunbiased,asnoassumptionofanobjectiveisrequired.
R E V I E W S
4 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
-
basis of the number of reactions that are catalysed by each
enzyme. The authors showed that reactions that are associated with
specialist enzymes are more likely to be essential on the basis of
growth phenotypes of knockout strains, and that these reactions are
more likely to carry a high and variable flux across hundreds of
insilico envi-ronmental conditions on the basis of MCMC sampling
(BOX2). Large, disparate data sets were used to validate
simulations. Gene essentiality predictions were validated by
comparison with a gene deletion collection46. An analysis of
kinetic parameters from the Braunschweig enzyme database (BRENDA)47
showed that specialist enzymes have higher invitro catalytic
activity (that is, higher turnover number (kcat)) and higher
substrate affinity (that is, lower Michaelis constant (Km)).
Omic
data sets revealed that specialist enzymes are more tightly
regulated at multiple levels, which is indicated by transcriptional
and post-translational modifications, as well as by
small-molecule-mediated control. Although enzyme promiscuity has
not been fully elucidated and might not be fully captured in the
model, the CBM is nevertheless the best self-consistent
representation of known metabolic reactions and enzymes in E.coli.
Consequently, these predictions provided a direction to integrate
and interpret disparate data sets with the CBM, thereby validating
a genome-scale hypothesis that the evolution of an enzyme towards
specificity and catalytic efficiency is dependent both on the
function of the enzyme in its metabolic network context and on its
evolutionary response to selection pressures.
Figure 1 | The multiple uses of high-throughput data in
constraint-based models. Constraint-based modelling can be used to
interpret and augment omic data sets by using an underlying
cellular network that has been biochemically validated. Metabolites
are represented by circles. a | Similarly to pathway enrichment
analysis and interaction networks, high-throughput data can be
integrated with the metabolic network topology to determine
enriched regions and even significantly perturbed metabolites32. b
| Omic data add an additional layer of constraints for reaction
fluxes. One study48 integrated expression profiling data to
determine context-specific flux distributions (pathway shown in
red), which increases the fidelity of the data (represented as
bars) as well as the accuracy of flux predictions (upper panel). In
addition, two other studies77,78 used omic data to build cell- and
tissue-specific models of human metabolism by removing unexpressed
reactions (shown as discoloured reactions) from the global human
metabolic network (lower panel). Differences in these networks can
be exploited to learn unique features of each network. c |
Constraint-based analysis predictions can be compared and validated
against high-throughput data sets. One study41 compared
flux-balance analysis solutions of different objectives against 13C
fluxomic data to find a combination of objectives that best fit the
invivo fluxes.
High-throughputdata integration
Multi-omicdataintegration
DownregulatedLow flux
Upregulated
Downregulated
Upregulated
Downregulated
Upregulated
Enriched regionsof change
High flux
For cell- and tissue-specific model building
For context-specific flux distributions
High- throughputdataintegration
Sum
of fl
uxes
Comparing objectives tomatch 13C fluxomic data
Comparison
Nature Reviews | Genetics
b Constraining the solution spacea Topological enrichment c
Comparison
Simulated fluxes High-throughput data
ATPyield
Biomassyield
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 5
2014 Macmillan Publishers Limited. All rights reserved
-
Machine learning methodA method that applies statistical methods
to discover generalizable rules and patterns in complex data
sets.
What is the role of metabolism in pathogenesis? Intracellular
pathogens adapt metabolism to their host environment during
pathogenesis. One study48 gene-rated transcriptional profiling data
of pathogenic intra-cellular growth to investigate the relationship
between metabolism and pathogenesis of Listeria monocytogenes. The
researchers analysed the data both through tradi-tional pathway
enrichment analysis (TABLE 1) and through integration with a CBM of
L.monocytogenes. They used the iMAT algorithm34 that computes a
flux distribution which best uses reactions that are associ-ated
with upregulated genes and which avoids using reactions that are
associated with downregulated genes (FIG.1b), thereby predicting
differential reaction use between conditions. By comparing pathway
enrich-ment analysis of the transcription data with and without
iMAT, the authors found that iMAT increased accuracy in
representing known changes in intracellular growth because both the
CBM and the computed flux states contextualize the expression data.
In this way, incor-rectly upregulated transcripts either due to a
false-positive measurement or due to post-transcriptional
regulation are algorithmically corrected if the rest of the
associated pathway is inactive (FIG.1b) and vice versa. The higher
predictive accuracy helped the authors to focus their experiments
on highly active pathways, which were then experimentally confirmed
by generating conditional knockout strains. Prospective experiments
that were based on the identified pathways showed that limiting
concentrations of branched-chain amino acids induced virulence
activator genes and elucidated the role of amino acid metabolism in
patho-genesis. Thus, analysis of transcription data is often
hindered by the low signal-to-noise ratio and by the limitation
that post-transcriptional regulation is not captured in these data
sets. These limitations can be ameliorated for metabolic
transcripts by the con-textualization of the iMAT algorithm
andCBMs.
Characterizing interaction networksRecent work shows that CBMs
can be used to place interaction networks of diverse biological
components into context and to interpret these networks.
Interaction networks describe the phenomenological interactions
between different biomolecules, including genes49, proteins50 and
transcription factors51. Such interaction networks are information
dense and highly valuable, but they cannot generally be formulated
into a modelling framework for prediction of physiological
functions. In the following examples, the mechanistic information
in CBMs is used in conjunction with biomolecular interac-tion
networks to derive principles that underlie cellular
organization.
Genetic interaction networks. The theoretical aspects of genetic
interactions of metabolic genes in Saccharomycescerevisiae that
were derived using CBMs have been studied5254. A recent study55
used a CBM and experimental data to discover mechanistic principles
that underlie global properties of S.cerevisiae genetic interaction
data (FIG.2a). First, the authors
experimentally and computationally quantified genetic
interactions for the genes in the S.cerevisiae CBM56. Both the
experimental data and the computational pre-dictions showed a
global property that genes which are associated with low-fitness
single mutants share many genetic interactions. They then used the
CBM to pro-pose a mechanistic explanation of this phenomenon. The
researchers showed that these deleterious gene deletions directly
disrupt the production of multiple metabolite precursors that are
necessary for cellular growth. Thus, these genes share genetic
interactions with other genes that contribute to various aspects of
their functionality.
The same researchers found that FBA underpredicts genetic
interactions, which can be attributed to the optimality assumption
of FBA, to the inherent inability of FBA to capture regulation and
data on toxic interme-diates, or to an incompletely or incorrectly
annotated metabolic network. To determine whether modifying the CBM
could increase its predictive power, a machine learning method was
implemented to reconcile the two networks by removing reactions,
modifying reaction reversibility and altering the biomass function
in the CBM. Model refinement identified one of the two NAD+
biosynthetic pathways from amino acids in the CBM as a source of
inaccurate predictions. Through growth experiments using mutant
strains, the research-ers confirmed that the second biosynthetic
pathway was not present in S.cerevisiae. This study shows that CBMs
can suggest the mechanistic underpinnings of genetic interaction
networks and that the compari-son of the metabolic and genetic
interaction networks can lead to targeted improvements in
biochemical knowledge.
Transcriptional regulatory networks. CBMs have aided the
characterization of underlying principles of transcriptional
regulatory networks for E.coli metabo-lism57 (FIG.2b). Previous
studies have shown a moderate link between metabolic topology and
transcriptional regulation26,58. To provide a more detailed
analysis, one study57 calculated potential pathways through
meta-bolic subsystems of the E.coli CBM. Metabolic pathway
structure has been of great interest26 because a full enu-meration
of pathways can describe all possible steady-state metabolic
phenotypes. However, the difficulty in computing Elementary Flux
Modes for genome-scale networks has hindered their widespread use.
A recently developed alternative Elementary Flux Patterns59
calculates metabolic pathways in individual sub-systems. This
method ignores pathways that traverse multiple metabolic subsystems
but is computationally tractable for genome-scale networks. By
comparing Elementary Flux Patterns with transcriptional profil-ing
data sets60, the authors showed that pathways were only moderately
co-expressed, but the degree of such expression varied greatly from
perfect co-expression to no co-expression. They then showed that
tran-scriptional regulation of pathways is dependent on the cost of
producing the associated enzymes and on the required response time.
In pathways that contain
R E V I E W S
6 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
-
Gap-fillingPertaining to a procedure for targeted expansion of
metabolic knowledge, whereby prospective experiments are designed
on the basis of discrepancies in experimental data and model
predictions.
expensive proteins (that is, proteins that are larger in size),
transcriptional regulation typically occurs for all enzymes in the
pathway, whereas pathways with low-cost proteins are typically
transcriptionally regulated only at the first and last enzymes of
the pathway. This categorization explained some cases of low
co-expression. Thus, by pairing the CBM network topology with
tran-scriptional regulatory networks, this study was able to
outline key principles of metabolic regulation for different types
of metabolic pathways.
Targeted expansion of metabolic knowledgeThe studies discussed
above focus on integrating CBMs with large-scale data sets to gain
mechanistic under-standing. However, incomplete knowledge of the
meta-bolism of the target cell leads to inaccurate predictions. One
feature of computational models is that incorrect predictions can
identify missing or incorrect metabolic knowledge. Thus, the
discrepancies between CBM predictions and experimental data have
been used to design targeted experiments that correct such
inaccuracy in metabolic knowledge61,62.
Discovering new human metabolic capabilities. The initial
reconstruction of the global human metabolic network Recon 1
(REF.63) is incomplete owing to gaps in our knowledge of human
metabolism. Thus, Recon 1 is missing metabolic reactions. Using an
established protocol62, one study64 identified such gaps in our
knowledge by simulating either production or consumption of every
metabolite to assess whether the metabolite was fully connected to
the rest of the net-work. For the metabolites that were not fully
connected, a universal database of metabolic reactions65 was used
to predict the fewest reactions that were required to fully connect
them. The authors found 73 candidate gap-filling solutions that
fully connected the disconnected metabolites, 47 of which were
supported by the litera-ture. Focusing on gluconate, which is a
disconnected metabolite, the authors experimentally characterized
open reading frame 103 on chromosome 9 (C9orf103) as the gene that
encodes gluconokinase. This study illus-trates how a
self-consistent model of metabolism guides researchers to refine
experiments to fill in missing gaps in our current knowledge.
Nature Reviews | Genetics
Pathwayanalysis
Low protein investment
High protein investment
Regulation of pathwaysdiffers based on protein cost
NAD+
L-asp
L-trp
Model refinement
WT
bna
Growth screens
Single mutant fitness
Deg
ree
of e
pist
asis
Global properties
Incorrect model predictions
a Characterizing genetic interactions b Characterizing
transcriptional regulation
Determinepathway co-regulation
FBA knockoutcomparison
Metabolic network Genetic interactionnetwork
Figure 2 | Predictive case studies in understanding underlying
principles of interaction networks. Many network types are used to
represent cellular behaviour. Recent studies have compared the
properties of interaction networks against constraint-based models
(CBMs) to learn global principles. a | One study55 compared an
experimental set of genetic interactions for metabolic genes
against interactions that were predicted by flux-balance analysis
(FBA). The CBM was able to recapitulate many of the invivo
principles. However, there was a high number of incorrect model
predictions. Using machine learning techniques, key changes to the
metabolic network that would improve model accuracy were
identified. Using growth screens, the authors validated that the
synthesis of NAD+ from amino acids was only possible from
l-tryptophan (l-trp) but not from laspartate (l-asp). bna refers to
any of the genes that are related to the kynurenine pathway,
including bna1, bna2, bna4 and bna5. b | Another study57 calculated
metabolic pathways Elementary Flux Patterns for the network.
Elementary Flux Patterns decompose the metabolic network into
distinct functional pathways (shown by different colours). The
degree of co-regulation of the genes of each pathway was
calculated, which reveals that some pathways are highly correlated,
whereas others are not. Variation in co-regulation was attributed
to the cost that is needed for building the proteins in a
particular pathway.
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 7
2014 Macmillan Publishers Limited. All rights reserved
-
AuxotrophiesMetabolic limitations that impair the ability of a
cell or organism to synthesize a particular metabolite that is
essential for growth, which force the cell or organism to rely on
an exogenous source of the nutrient.
Discovering enzyme functions in E.coli. Constraint-based
modelling has been used to discover biochemical knowledge about the
well-characterized metabolic net-work of E.coli. Through systematic
genetic perturbation of E.coli central carbon metabolism, one
study66 dis-covered a novel pathway and previously uncharacterized
enzymatic functions. Single, double and triple knock-out strains of
central metabolic genes were grown on 13 different growth
conditions with various carbon sources to determine positive and
negative genetic interactions. Concurrently, genetic interactions
were predicted using the E.coli CBM45. After careful removal of
false predic-tions that were due to model assumptions (for example,
the inability of FBA to differentiate between major and minor
isozymes, as enzyme abundance and kinetic acti-vity are not
captured), it was observed that discrepan-cies that were related to
talAB interactions in the pentose phosphate pathway could not be
reconciled. To deter-mine the cause, the authors generated
transcription and metabolite profiling data for the wild-type and
knockout strains. A metabolomic analysis identified a new
meta-bolite sedoheptulose-1,7-bisphosphate that had not been
previously characterized in E.coli, which suggests the existence of
a novel reaction. Using metabolic flux analysis and invitro enzyme
assays, they confirmed that phosphofructokinase carries out the
reaction and that glycolytic aldolase can split the seven-carbon
sugar into three- and four-carbon sugars. Thus, the detailed
analysis of the CBM against data discrepancies found two new
catalytic functions of classic glycolytic enzymes.
Designing metabolic phenotypesCBMs have been used for
translational applications, including the design of metabolic
phenotypes. In the past ten years, many algorithms have been
developed for predicting useful genetic manipulation strategies for
metabolic engineering67,68. They have also been impor-tant in
assessing the net energy balance and the level of greenhouse gas
emission for bioethanol and biodiesel production69. Here, we
discuss one recent CBM success in this field of research.
Production of non-natural, commodity chemicals. There has been a
push to use biotechnology to produce com-modity chemicals. To this
end, one study70 designed an E.coli strain that produces
1,4-butanediol (BDO) at high yields. Two key hurdles were overcome
using computa-tional methods. First, BDO is not a naturally
occurring compound in any organism. The authors used a pathway
prediction algorithm71 that determines the necessary biochemical
transformations to convert an endogenous E.coli metabolite to BDO.
A final pathway was chosen on the bases of thermodynamic
feasibility72, the theoreti-cal yield of BDO (which was determined
using FBA), the number of known enzymes for the biochemical
trans-formations (which was determined using pathway data-bases73)
and the topological distance of the pathway from central carbon
metabolism. Second, when the pathway was introduced, the organism
did not produce BDO at high rates; thus, a rational approach to
producing a met-abolic design was pursued using the E.coli CBM45
and
the OptKnock algorithm74. A four-knockout strategy that blocked
the production of natural fermentation products was chosen to force
the strain to balance redox and to channel all carbon flux through
BDO production (FIG.3a). Further genetic manipulations were needed
to create the final strain, which included modifying tran-scription
factors, swapping E.coli metabolic enzymes with non-native enzymes
and optimizing codons. There are many hurdles to designing a
production strain, but this study shows that CBMs can have a vital
role in accelerating and completing the industrial strain design
pipeline to produce non-natural metabolites.
Discovering drug targetsThe ability of constraint-based
modelling to predict the effects of gene knockouts provides an
important tool for drug targeting studies75. Three recent
experimentally validated studies have discovered novel cancer drug
targets and antibiotics.
Exploiting deficiencies in cancer metabolism. There has been
renewed interest in studying metabolic alterations that occur in
cancer cells76. In two studies77,78, researchers hypothesized that
they could use CBMs to determine and exploit the metabolic
auxotrophies of cancer cells. The first study77 used a
model-building algorithm79 that uses cues from transcriptomic data
to prune metabolic reactions from Recon1 (REF.63) in order to build
a generic cancer model (FIG.1b). They then used FBA-predicted
knockout phenotypes to determine cytostatic drug targets that
selectively block growth of the cancer model but that do not affect
ATP generation and growth of the healthy Recon1 model.
Interestingly, even though cancer cells have heterogeneous
genotypes and pheno-types, it was found that approved or
experimental cancer drugs exist for 40% of the determined
cytostatic drug targets. Analyses of growth phenotypes using CBMs
focus on the capacity for growth under single knockout condi-tions.
The surprising agreement between computational predictions and
experimental results for a generic model suggests that the
metabolic capabilities of cancer cells are starkly different from
those of healthy human cells, which allows drug combinations to be
detected.
In a follow-up study78, the researchers experimentally
investigated fumarate hydratase deficiency that can cause
hereditary leiomyomatosis and renal cell cancer. At the time, no
mechanism for reduced NADH regeneration was known for fumarate
hydratase-deficient cells. The researchers immortalized and
constructed two cell lines, one of which expressed fumarate
hydratase and one that was deficient in fumarate hydratase.
Starting from Recon1, transcription data were used to build two
cell line-specific models. By comparing predicted knockout growth
phenotypes, they identified a selectively essen-tial pathway for
haem biosynthesis and degradation in fumarate hydratase-deficient
cells, which represented a potential mechanism for NADH
regeneration. Haem oxygenase1 (HMOX1) was experimentally inhibited
in both cell lines, and fumarate hydratase-deficient cells were
selectively killed, which shows that fumarate hydratase and HMOX1
are in fact synthetically lethal as predicted.
R E V I E W S
8 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
-
Reaction boundsUser-defined constraints on the minimum and
maximum allowable flux values for a particular metabolic reaction
in a constraint-based model.
Metabolite essentiality analysisA metabolite-centric approach to
determine essential components for cellular growth. To
computationally test the essentiality of a metabolite, the
consuming reactions of the particular metabolites are constrained
to zero.
Interestingly, the haem pathway ranked only 39th in terms of
overexpressed pathways in fumarate hydratase-deficient cells, which
meant that the predic-tions were only possible through combining
expression data with the CBM. These results are a step towards
identifying effective anticancer drugs using genome-scale metabolic
knowledge. As the predictions of the CBM focus on differential
metabolic capacities, there is a potential for false-negative
predictions, as additional layers of differences are not taken into
account. In addi-tion, it will be interesting to see whether these
methods can be extended to other cancer types in which germline
mutations are either unknown or absent. The identifi-cation of the
haem pathway as synthetically lethal with fumarate hydratase
represents a key success in using CBM predictions for prospective
experimentation for studying human disease.
Essential metabolites guide antibiotic discovery. Gene knockout
simulations in CBMs are accomplished by constraining the
gene-associated reaction bounds to zero (BOX1). Moving past a
gene-centric approach, an alternative approach for drug targeting
is metabolite essentiality analysis80 (FIG.3b). To remove a
metabolite in a CBM, the bounds of the reactions that consume
the
metabolite are constrained to zero, and the steady-state
constraint for that particular metabolite is relaxed to allow
internal metabolite accumulation.
One study81 reconstructed a genome-scale meta-bolic network for
Vibrio vulnificus, which is a Gram-negative pathogen. By applying
metabolite essentiality analysis, the authors found 193 metabolites
that are essential to cellular growth. They narrowed the list down
to five essential metabolites that represented promising targets
for drug development by removing metabolites that are found in
humans to lower poten-tial toxic adverse effects and by removing
metabolites that have a single consuming reaction for a more robust
effect on the pathogen. The identified metabolites typi-cally
affect a single gene, which means that traditional reaction
knockouts could have beenused.
However, using a metabolite-centric approach has its advantages.
It allowed the authors to search for struc-tural analogues of the
essential metabolites to inhibit the enzymes that relied on them as
substrates. They screened the inhibitory capability of 352
compounds that were structurally similar to the predicted essential
metabolites, and the most potent compound was chosen for further
evaluation as an antibiotic. The compound was confirmed to bind to
the target enzyme in folate
Figure 3 | Predictive case studies in metabolic engineering and
drug targeting. Constraint-based models have been used for
answering important questions in translational research. a | One
study70 used multiple computational and experimental tools to
design an Escherichiacoli strain that produces 1,4-butanediol
(BDO). An unengineered wild-type (WT) strain trades off metabolite
production with cellular growth (shown by the solid line in the
solution space). Using the OptKnock algorithm, BDO production was
coupled with the growth objective of the cell by forcing the
synthetic BDO pathway to be the sole route for E.coli to maintain
redox balance (shown by black arrows). Thus, the solution space is
modified such that BDO production is linked to cellular growth
(shown by the dashed line in the solution space). b | In one
study81, researchers took an alternative, metabolite-centric
approach to drug targeting, which computationally removes consuming
reactions of a particular metabolite. The approach was
experimentally confirmed for Vibrio vulnificus by a structural
analogue of the endogenous metabolite, which also acts as a
small-molecule inhibitor. c | Metabolic reactions in the E.coli
model were augmented to capture the generation of reactive oxygen
species (ROS), which allowed the use of flux-balance analysis to
predict ROS production in one study82. In follow-up experiments,
the authors show that it is possible to predict drug target
strategies to enhance endogenous ROS production to increase the
efficacy of other antibiotics. TCA, tricarboxylic acid cycle.
Nature Reviews | Genetics
Growth rate
Prod
uct s
ecre
tion
WT
OptKnockstrain
Small-moleculeinhibition
Lactate
ReductiveTCA
Glycolysis
Oxidative TCA
BDONAD+
NAD+
2 NAD+
O
O
O
O
O
O
N+
N+Structuralanalogue
Formate Ethanol
NAD+NAD+
ROS-augmented reactionsthat use flavins, quinones and transition
metalsReaction knockout
Metabolite
H2O2 and O2
c Predicting ROS production
Metabolite deletion
Gene deletion
b Drug discoverya Metabolic engineering
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 9
2014 Macmillan Publishers Limited. All rights reserved
-
Coupling constraintsConstraints that enforce strict
relationships between model biochemical transformations, thereby
connecting the fluxes for different cellular processes (such as
transcription, translation, and tRNA and protein use for a
metabolic reaction).
Linear programmingA mathematical optimization technique that
calculates the maximum or minimum value of a particular variable
(that is, the objective function) on the basis of a set of linear
constraints; an example of this is flux-balance analysis.
Consensus sequencesConserved sequences of nucleotides or amino
acids that represent the target for a biomolecular event, often for
proteins binding to the genome.
biosynthesis, which validated the CBM and chemo-informatic
predictions. Furthermore, they found that the compound was more
effective than current anti-bacterials. By using a CBM to analyse
metabolism, antibiotic discovery can be approached from multiple
perspectives (for example, from the perspective of a gene, a
reaction or a metabolite).
Increasing antibiotic efficacy through ROS production. ROS can
weaken and kill pathogens, and modulation of ROS production could
therefore be used as part of an antimicrobial strategy. As a proof
of concept, one study82 predicted genetic engineering strategies in
E.coli to increase internal ROS generation in order to increase
antibiotic efficacy. The current E.coli CBM45 does not account for
the major sources of ROS production. Thus, 133 metabolic reactions
with potential for ROS generation were augmented in the E.coli CBM
(FIG.3c). With the updated CBM, computed flux distributions of
single knockouts included a quantitative readout of ROS generation.
Thus, gene knockouts that increase the endogenous ROS production
were predicted, many of which increased inefficiencies in
production or usage of ATP. For validation, the researchers
experimentally knocked out genes that were predicted to increase
endogenous ROS production, as well as genes that were predicted to
have no effect as negative controls. There was high qualitative
concordance of the predictions with experimental measurements of
ROS production, which suggests that a CBM could be used to tune ROS
production. These results are striking because little quantitative
information was necessary in the coupling of flux with ROS
production and because a statistical ensemble approach was used to
account for unknown parameters. This study was able to predict
genetic engineering strategies that were proven to increase ROS
production and to potentiate oxidative attack from oxidants and
antibacterials, which provides a novel approach for antibiotic
discovery.
Coupling with other cellular processesConstraint-based modelling
has been mainly applied to metabolism. However, researchers have
recently extended the scope of CBMs and combined them with
different modelling methods to address ques-tions beyond
metabolism. Two approaches that have emerged are extending CBMs of
metabolism to include additional cellular processes and connecting
different modelling methods.
Modelling transcription, translation and metabolism. CBMs have
been reconstructed for cellular processes other than metabolism912.
However, until recently, CBMs of different cellular processes had
not been inte-grated. One study83 integrated a CBM of Thermotoga
maritima metabolism84 with a CBM for transcription and
translation83 (FIG.4a). By adding information about the
transcription and translation machinery, the CBM accounts for mRNA
transcription, protein transla-tion, necessary post-translational
modifications of proteins and use of the protein complex to
catalyse
metabolic reactions in T.maritima (FIG.4a). To couple the
necessary machinery for a particular metabolic reaction, the
authors used coupling constraints85 that mathematically link a
metabolic reaction flux with its required molecular and enzymatic
machinery in formulating the linear programming problem. The result
is an integrated network reconstruction that contains the molecular
biology and metabolism of T.maritima at the genome scale and that
allows the computation of the functional proteome that is needed to
express a given phenotype. The incorporation of new cellular
processes in the constraint-based modelling framework is exciting
but requires additional parameterization of enzyme efficiencies
under different biological condi-tions. A key challenge for the
improvement and the use of these new models is the development of
para-meterization techniques that are driven by proteomic and
transcriptomicdata.
The integrated model hopes to address some of the crucial
challenges that have limited metabolic CBMs. First, the integrated
model takes into account the vari-ability of cellular composition
at different growth rates, while metabolic CBMs only use one
biomass function for growth rate optimization. Cellular composition
is dependent on growth rate, and metabolic CBMs have previously
accounted for variations in growth rate with different cellular
compositions86. However, an integrated model explicitly represents
nucleotide and protein demands as a function of growth rate, so
that a traditional biomass function is no longer necessary. Second,
by coupling transcript and protein synthesis with active metabolic
reactions, the authors quantitatively predicted differential
experimental tran-scriptome and proteome levels across varying
condi-tions. They used upstream genomic sequences of the
differentially expressed genes to determine putative consensus
sequences for transcription factor binding. The newly derived
sequences helped to identify a can-didate metabolite transporter,
which was subsequently verified experimentally87. Finally, by
incorporating the required demands for the machinery for
metabolism, the integrated CBM unifies the three-objective Pareto
surface that was discussed above into a single objec-tive88. Thus,
as the content of these models increases, the ability of CBMs to
explain and predict biological functions grows inscope.
Merging statistics with mechanistic networks. Statistical
approaches are useful when there is limited know-ledge of the
underlying biological networks. Unlike metabolic networks, the
functional states of transcrip-tional regulatory networks (TRNs)
are harder to define mechanistically because the underlying
biochemistry and biophysics are often unknown. One study modelled
the cellular processes of metabolism and transcrip-tional
regulation using two different modelling formulations, which
included a CBM for metabolism and a probability metric that is
based on omic data for the TRN89 (FIG.4b). For E.coli and
Mycobacterium tuberculosis, the authors amassed the available
transcriptional profiling data sets and the existing
R E V I E W S
10 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
-
transcriptional regulatory interaction networks. Rather than
using a Boolean formulation for the TRN90, they calculated the
probabilities of activation and repression on the basis of the
collected expression data sets for each pair of transcription
factor and targetgene.
Similarly to how basic constraints are added (BOX2), the TRN was
combined with metabolism by adjusting upper and lower bounds of
individual metabolic reac-tions in the CBM on the basis of both the
calculated
probabilities of activation of associated target meta-bolic
genes and the allowable flux states (which are determined by FVA)
(FIG.4b). The integrated E.coli metabolic regulatory model was more
accurate in pre-dicting transcription factor-knockout phenotypes
than pre vious attempts that used integrated models90. The newly
developed integrated M.tuberculosis network predicted drug targets
and aided the identification of novel regulatory roles of
transcription factors. Although
Transcription
Translation
Amino acidsNTPs
Genome
Simultaneously predicting metabolicfluxes, gene expression and
protein levels
Predicting TF-knockout phenotypes
Simulated
Expe
rim
enta
l
Predicting thermotolerance of E. coli
Integrated metabolic model reaction
Traditional metabolic model reaction
P(Enz|TF) = 0.5
TF knockout
Med
ia
Temperature (C)
Gro
wth
rate
- 0
Expanded modelling method
Expanded scope of predictions
-0
Metabolic network
Integrating a probabilistictranscriptional regulatory
network
Integrating structuralsystems biology
Expanding to explicitly account fortranscription and translation
machinery
Nature Reviews | Genetics
a b c
Figure 4 | Expanding predictive scope through integrative
modelling. The predictive scope of constraint-based modelling has
been extended beyond metabolism either by explicitly accounting for
non-metabolic components in the constraint-based modelling approach
or by coupling with other modelling frameworks. Metabolites are
represented by circles. a | The transcription and translation of
the necessary mRNA, proteins and cofactors have been explicitly
represented in a constraint-based modelling framework alongside the
metabolism of Thermotogamaritima83 (upper panel). This allows
simultaneous computation of metabolic fluxes, mRNA transcript
expression and proteome levels (lower panel). b | Metabolic models
have also been coupled with other modelling frameworks. The
probability of metabolic gene activation and repression by
transcription factors (TFs) can be computed using a probabilistic
transcriptional regulatory network that is based on high-throughput
data sets (upper panel). The calculated probabilities are then
relayed into the constraints of the metabolic reaction fluxes in
the constraint-based model89, which allow prediction of TF-knockout
phenotypes (lower panel). c | Structural systems biology can
predict biophysical properties of proteins. One study91 calculated
the individual activity changes of each metabolic enzyme during
temperature shift. The combined effect of all the metabolic enzymes
on the cell was computed by integrating the individual enzyme
changes into the flux constraints of the Escherichiacoli
constraint-based model (upper panel), which allowed growth rate to
be predicted as a function of temperature (lower panel). Enz,
enzyme; NTP, nucleoside 5-triphosphate; P, probability.
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 11
2014 Macmillan Publishers Limited. All rights reserved
-
the TRN modifies the CBM of metabolism, the calcu-lated flux
distributions and the metabolites that are present do not feedback
to parameterize the TRN. A further improvement of integrated
modelling between transcriptional and metabolic networks will be to
include feedback mechanisms from metabolism.
Structural systems biology. One study expanded the E.coli
metabolic network by including the experi-mentally derived protein
structures (where available) and the computationally predicted
protein structures for the metabolic enzymes in the network91
(FIG.4c). Using structural bioinformatic92 techniques, changes in
enzyme activity were predicted and were used as constraints on the
activity of individual metabolic reac-tions. The researchers
focused on thermostability of E.coli enzymes to study growth rate
as a function of temperature. With this approach, they were able to
computationally predict growth rates at varying tem-peratures,
which were consistent with experimental data. The growth-limiting
enzymes were then deter-mined on the basis of temperature-dependent
flux constraints. Although other temperature-dependent parameters,
such as cellular composition, were not considered93, the predicted
growth-limiting enzymes significantly overlapped with mutated genes
from a previous study94 on adaptive evolution of E.coli to higher
temperatures. For direct experimental valida-tion, the
growth-constraining enzymes were bypassed by supplementing growth
media with the metabolic product of the enzyme or of the pathway to
which it belongs. Such supplementation was beneficial for E.coli
that was grown at superoptimal temperatures, which supported the
predictive capability of CBMs to account for enzyme
thermosensitivity. These promis-ing results raise the prospect of
the substantial effects that structural modelling might have on
improving CBM predictions in thefuture.
ConclusionsGregor Mendel described discrete quanta of
informa-tion travelling from one generation to the next, which
determines the form and the function of an organism. Subsequently,
Wilhelm Johannsen formulated the con-cept of a gene as the quanta
of information, which led him to the definition of genotype and
phenotype. Since then, a major goal of biology has been the
quantitative description of the fundamental genotypephenotype
relationship.
The push in the quantitative biological sciences to understand
macroscopic properties from microscopic measurements has parallels
to the elucidation of funda-mental principles in physics several
hundred years ago. For example, the EinsteinSmoluchowski relation
is a model for Brownian motion that quantitatively predicts
properties of diffusion. Although the theory was an approximation
of the physical processes95, it has been applied and has helped to
develop more sophisticated models. This Review suggests that the
life sciences have now reached a point at which many aspects of the
genotypephenotype relationship for metabolism can
be quantified and used to build mechanistic models that allow
meaningful biological predictions to be made. The formulation of
high-dimensional models that are required to compute full molecular
phenotypes are enabled by genome sequencing technology, which
allows the generation of a cellular parts list; by various omic
data types, which allows a functional readout of these parts; and
by mechanistic modelling frameworks that are amenable to
reconciling omic data, network structure and knowledge from primary
literature. The successes of the 14 studies discussed here
demonstrate that constraint-based modelling is an approach that
enables the genome-scale study of metabolism.
As with any model, the mathematical theory and the applications
of constraint-based modelling will continue to be challenged and
refined, thus improv-ing our interpretation of biological
phenomena. We foresee progress to unfold in several major
directions. First, constraint-based modelling has mainly focused on
metabolism, and more integrative modelling approaches must be
explored. Trends in current litera-ture89,91,96 indicate that other
cellular processes may be modelled using alternative frameworks
that are better suited for a particular biological phenomena.
Statistical approaches are also powerful for modelling biological
processes that are poorly understood. Integrating other approaches
with CBMs of metabolism can expand the scope of quantitative
prediction. Second, the majority of applications of CBMs have been
for single-cell organisms. We see two areas of application into
which CBMs are likely to expand: human disease and the microbiome.
Although the human reconstruction (that is, Recon 1) is far from
complete, the cancer drug tar-get studies showed that quantitative
predictions are still possible. With the availability of the second
build (that is, Recon 2)97, we foresee greater applied uses in
human disease. There has also been a steady increase in the amount
of omic data of the human microbiome, and CBMs will have an
important role in analysing these complex data sets98,99. Third,
the underlying assump-tions and methods for constraint-based
modelling analyses will continue to evolve as more data types
become available. Similarly to the testing of optimality
assumptions of FBA, other key assumptions of CBMs will be tested in
the next few years. For example, with the increasing availability
of time-course metabo-lomics, the steady-state assumption can be
bypassed and concentration changes can be explicitly modelled.
Rather than assuming constant internal metabolite levels, these
concentrations can be directly measured over a time course in an
experiment and the rate of change can be integrated explicitly. In
addition, the increasing availability of genomic data and
sophis-ticated models for the interpretation of these data will
allow explicit description and integration of the dependence of
genomic sequence on gene expression, protein synthesis and protein
structures for metabolic reactions in CBMs. We anticipate that
these develop-ments will enable even greater growth in the
diversity of predictions and in the biological discoveries that are
achievable by using constraint-based modelling.
R E V I E W S
12 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
-
1. Feist,A.M., Herrgard,M.J., Thiele,I., Reed,J.L. &
Palsson,B.O. Reconstruction of biochemical networks in
microorganisms. Nature Rev. Microbiol. 7, 129143 (2009).This is a
review on constructing and validating a genome-scale metabolic
network.
2. Thiele,I. & Palsson,B.O. A protocol for generating a
high-quality genome-scale metabolic reconstruction. Nature Protoc.
5, 93121 (2010).
3. Lewis,N.E., Nagarajan,H. & Palsson,B.O. Constraining the
metabolic genotypephenotype relationship using a phylogeny of
insilico methods. Nature Rev. Microbiol. 10, 291305 (2012).This is
a thorough review of the various constraint-based modelling
methodologies.
4. Zhuang,K. etal. Genome-scale dynamic modeling of the
competition between Rhodoferax and Geobacter in anoxic subsurface
environments. ISME J. 5, 305316 (2011).
5. Klitgord,N. & Segre,D. Environments that induce synthetic
microbial ecosystems. PLoS Comput. Biol. 6, e1001002 (2010).
6. Bordbar,A. etal. A multi-tissue type genome-scale metabolic
network for analysis of whole-body systems physiology. BMC Syst.
Biol. 5, 180 (2011).
7. Bordbar,A., Lewis,N.E., Schellenberger,J., Palsson,B.O. &
Jamshidi,N. Insight into human alveolar macrophage and
M.tuberculosis interactions via metabolic reconstructions. Mol.
Syst. Biol. 6, 422 (2010).
8. Lewis,N.E. etal. Large-scale insilico modeling of metabolic
interactions between cell types in the human brain. Nature Biotech.
28, 12791285 (2010).
9. Papin,J.A. & Palsson,B.O. The JAKSTAT signaling network
in the human B-cell: an extreme signaling pathway analysis.
Biophys. J. 87, 3746 (2004).
10. Li,F., Thiele,I., Jamshidi,N. & Palsson,B.O.
Identification of potential pathway mediation targets in Toll-like
receptor signaling. PLoS Comput. Biol. 5, e1000292 (2009).
11. Gianchandani,E.P., Joyce,A.R., Palsson,B.O. & Papin,J.A.
Functional states of the genome-scale Escherichia coli
transcriptional regulatory system. PLoS Comput. Biol. 5, e1000403
(2009).
12. Thiele,I., Jamshidi,N., Fleming,R.M. & Palsson,B.O.
Genome-scale reconstruction of Escherichia colis transcriptional
and translational machinery: a knowledge base, its mathematical
formulation, and its functional characterization. PLoS Comput.
Biol. 5, e1000312 (2009).
13. Fell,D.A. & Small,J.R. Fat synthesis in adipose tissue.
An examination of stoichiometric constraints. Biochem. J. 238,
781786 (1986).
14. Majewski,R.A. & Domach,M.M. Simple constrained
optimization view of acetate overflow in E.coli. Biotechnol.
Bioeng. 35, 732738 (1990).
15. Savinell,J.M. & Palsson,B.O. Optimal selection of
metabolic fluxes for invivo measurement. II. Application to
Escherichia coli and hybridoma cell metabolism. J.Theor. Biol. 155,
215242 (1992).
16. Varma,A. & Palsson,B.O. Stoichiometric flux balance
models quantitatively predict growth and metabolic by-product
secretion in wild-type Escherichia coli W3110. Appl. Environ.
Microbiol. 60, 37243731 (1994).
17. Schuster,S. & Hilgetag,C. On elementary flux modes in
biochemical reaction systems at steady state. J.Biol. Systems 2,
165182 (1994).
18. Schilling,C.H., Letscher,D. & Palsson,B.O. Theory for
the systemic definition of metabolic pathways and their use in
interpreting metabolic function from a pathway-oriented
perspective. J.Theor. Biol. 203, 229248 (2000).
19. Clarke,B.L. in Advances in Chemical Physics Vol. 43 (eds.
Prigogine, I. & Rice,S.A.) 1215 (Wiley, 1980).
20. Dandekar,T., Schuster,S., Snel,B., Huynen,M. & Bork,P.
Pathway alignment: application to the comparative analysis of
glycolytic enzymes. Biochem. J. 343, 115124 (1999).
21. Liao,J.C., Hou,S.Y. & Chao,Y.P. Pathway analysis,
engineering and physiological considerations for redirecting
central metabolism. Biotechnol. Bioeng. 52, 129140 (1996).
22. Fleischmann,R.D. etal. Whole-genome random sequencing and
assembly of Haemophilus influenzae Rd. Science 269, 496512
(1995).
23. Edwards,J.S. & Palsson,B.O. Systems properties of the
Haemophilus influenzae Rd metabolic genotype. J.Biol. Chem. 274,
1741017416 (1999).
24. Edwards,J.S., Ibarra,R.U. & Palsson,B.O. In silico
predictions of Escherichia coli metabolic capabilities are
consistent with experimental data. Nature Biotech. 19, 125130
(2001).
25. Segre,D., Vitkup,D. & Church,G.M. Analysis of optimality
in natural and perturbed metabolic networks. Proc. Natl Acad. Sci.
USA 99, 1511215117 (2002).
26. Stelling,J., Klamt,S., Bettenbrock,K., Schuster,S. &
Gilles,E.D. Metabolic network structure determines key aspects of
functionality and regulation. Nature 420, 190193 (2002).
27. Ibarra,R.U., Edwards,J.S. & Palsson,B.O. Escherichia
coli K-12 undergoes adaptive evolution to achieve insilico
predicted optimal growth. Nature 420, 186189 (2002).
28. Almaas,E., Kovacs,B., Vicsek,T., Oltvai,Z.N. &
Barabasi,A.L. Global organization of metabolic fluxes in the
bacterium Escherichia coli. Nature 427, 839843 (2004).
29. Papp,B., Pal,C. & Hurst,L.D. Metabolic network analysis
of the causes and evolution of enzyme dispensability in yeast.
Nature 429, 661664 (2004).
30. Pal,C., Papp,B. & Lercher,M.J. Adaptive evolution of
bacterial metabolic networks by horizontal gene transfer. Nature
Genet. 37, 13721375 (2005).
31. Hyduke,D.R., Lewis,N.E. & Palsson,B.O. Analysis of omics
data with genome-scale models of metabolism. Mol. Biosyst 9, 167174
(2013).This is a review of techniques to integrate omic data with
CBMs.
32. Patil,K.R. & Nielsen,J. Uncovering transcriptional
regulation of metabolism by using metabolic network topology. Proc.
Natl Acad. Sci. USA 102, 26852689 (2005).
33. Kharchenko,P., Church,G.M. & Vitkup,D. Expression
dynamics of a cellular metabolic network. Mol Syst Biol 1,
2005.0016 (2005).
34. Shlomi,T., Cabili,M.N., Herrgard,M.J., Palsson,B.O. &
Ruppin,E. Network-based prediction of human tissue-specific
metabolism. Nature Biotech. 26, 10031010 (2008).
35. Becker,S.A. & Palsson,B.O. Context-specific metabolic
networks are consistent with experiments. PLoS Comput. Biol. 4,
e1000082 (2008).
36. Carlson,R. &Srienc,F. Fundamental Escherichia coli
biochemical pathways for biomass and energy production: creation of
overall flux states. Biotechnol. Bioeng. 86, 149162 (2004).
37. Carlson,R. &Srienc,F. Fundamental Escherichia coli
biochemical pathways for biomass and energy production:
identification of reactions. Biotechnol. Bioeng. 85, 119
(2004).
38. Harcombe,W.R., Delaney,N.F., Leiby,N., Klitgord,N. &
Marx,C.J. The ability of flux balance analysis to predict evolution
of central metabolism scales with the initial distance to the
optimum. PLoS Comput. Biol. 9, e1003091 (2013).
39. Schuetz,R., Kuepfer,L. & Sauer,U. Systematic evaluation
of objective functions for predicting intracellular fluxes in
Escherichia coli. Mol Syst Biol. 3, 119 (2007).
40. Molenaar,D., van Berlo,R., de Ridder,D. & Teusink,B.
Shifts in growth strategies reflect tradeoffs in cellular
economics. Mol. Syst. Biol. 5, 323 (2009).
41. Schuetz,R., Zamboni,N., Zampieri,M., Heinemann,M. &
Sauer,U. Multidimensional optimality of microbial metabolism.
Science 336, 601604 (2012).
42. Lewis,N.E. etal. Omic data from evolved E.coli are
consistent with computed optimal growth from genome-scale models.
Mol. Syst. Biol. 6, 390 (2010).
43. Khersonsky,O. & Tawfik,D.S. Enzyme promiscuity: a
mechanistic and evolutionary perspective. Annu. Rev. Biochem. 79,
471505 (2010).
44. Nam,H. etal. Network context and selection in the evolution
to enzyme specificity. Science 337, 11011104 (2012).
45. Feist,A.M. etal. A genome-scale metabolic reconstruction for
Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and
thermodynamic information. Mol Syst Biol 3, 121 (2007).
46. Baba,T. etal. Construction of Escherichia coli K-12
in-frame, single-gene knockout mutants: the Keio collection. Mol
Syst Biol. 2, 2006.0008 (2006).
47. Scheer,M. etal. BRENDA, the enzyme information system in
2011. Nucleic Acids Res. 39, D670D676 (2011).
48. Lobel,L., Sigal,N., Borovok,I., Ruppin,E. &
Herskovits,A.A. Integrative genomic analysis identifies isoleucine
and CodY as regulators of Listeria monocytogenes virulence. PLoS
Genet. 8, e1002887 (2012).
49. Costanzo,M. etal. The genetic landscape of a cell. Science
327, 425431 (2010).
50. Uetz,P. etal. A comprehensive analysis of proteinprotein
interactions in Saccharomyces cerevisiae. Nature 403, 623627
(2000).
51. Gama-Castro,S. etal. RegulonDB version 7.0: transcriptional
regulation of Escherichia coli K-12 integrated within genetic
sensory response units (Gensor Units). Nucleic Acids Res. 39,
D98D105 (2011).
52. Segre,D., DeLuna,A., Church,G.M. & Kishnoy,R. Modular
epistasis in yeast metabolism. Nature Genet. 37, 7783 (2005).
53. Harrison,R., Papp,B., Pal,C., Oliver,S.G. & Delneri,D.
Plasticity of genetic interactions in metabolic networks of yeast.
Proc. Natl Acad. Sci. USA 104, 23072312 (2007).
54. He,X., Qian,W., Wang,Z., Li,Y. & Zhang,J. Prevalent
positive epistasis in Escherichia coli and Saccharomyces cerevisiae
metabolic networks. Nature Genet. 42, 272276 (2010).
55. Szappanos,B. etal. An integrated approach to characterize
genetic interaction networks in yeast metabolism. Nature Genet. 43,
656662 (2011).
56. Mo,M.L., Palsson,B.O. & Herrgard,M.J. Connecting
extracellular metabolomic measurements to intracellular flux states
in yeast. BMC Syst. Biol. 3, 37 (2009).
57. Wessely,F. etal. Optimal regulatory strategies for metabolic
pathways in Escherichia coli depending on protein costs. Mol. Syst.
Biol. 7, 515 (2011).
58. Notebaart,R.A., Teusink,B., Siezen,R.J. & Papp,B.
Co-regulation of metabolic genes is better explained by flux
coupling than by network distance. PLoS Comput. Biol. 4, e26
(2008).
59. Kaleta,C., de Figueiredo,L.F. & Schuster,S. Can the
whole be less than the sum of its parts? Pathway analysis in
genome-scale metabolic networks using elementary flux patterns.
Genome Res. 19, 18721883 (2009).
60. Faith,J.J. etal. Many Microbe Microarrays Database:
uniformly normalized Affymetrix compendia with structured
experimental metadata. Nucleic Acids Res. 36, D866D870 (2008).
61. Orth,J.D. & Palsson,B.O. Systematizing the generation of
missing metabolic knowledge. Biotechnol. Bioeng. 107, 403412
(2010).This is a review on techniques and applications of CBMs for
a targeted expansion of biochemical knowledge.
62. Reed,J.L. etal. Systems approach to refining genome
annotation. Proc. Natl Acad. Sci. USA 103, 1748017484 (2006).
63. Duarte,N.C. etal. Global reconstruction of the human
metabolic network based on genomic and bibliomic data. Proc. Natl
Acad. Sci. USA 104, 17771782 (2007).
64. Rolfsson,O., Paglia,G., Magnusdottir,M., Palsson,B.O. &
Thiele,I. Inferring the metabolism of human orphan metabolites from
their metabolic network context affirms human gluconokinase
activity. Biochem. J. 449, 427435 (2013).
65. Kanehisa,M., Goto,S., Sato,Y., Furumichi,M. & Tanabe,M.
KEGG for integration and interpretation of large-scale molecular
data sets. Nucleic Acids Res. 40, D109D114 (2012).
66. Nakahigashi,K. etal. Systematic phenome analysis of
Escherichia coli multiple-knockout mutants reveals hidden reactions
in central carbon metabolism. Mol. Syst. Biol. 5, 306 (2009).
67. Lee,S.Y., Lee,D.Y. & Kim,T.Y. Systems biotechnology for
strain improvement. Trends Biotechnol. 23, 349358 (2005).
68. Park,J.H. & Lee,S.Y. Towards systems metabolic
engineering of microorganisms for amino acid production. Curr.
Opin. Biotechnol. 19, 454460 (2008).This is a review of using
systems biology methodologies for metabolic engineering
applications.
69. Caspeta,L. & Nielsen,J. Economic and environmental
impacts of microbial biodiesel. Nature Biotech. 31, 789793
(2013).
70. Yim,H. etal. Metabolic engineering of Escherichia coli for
direct production of 1,4-butanediol. Nature Chem. Biol. 7, 445452
(2011).
R E V I E W S
NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 13
2014 Macmillan Publishers Limited. All rights reserved
-
71. Hatzimanikatis,V. etal. Exploring the diversity of complex
metabolic networks. Bioinformatics 21, 16031609 (2005).
72. Constantinou,L. & Gani,R. New group-contribution method
for estimating properties of pure compounds. AIChE J. 40, 16971710
(1994).
73. Khatri,P., Sirota,M. & Butte,A.J. Ten years of pathway
analysis: current approaches and outstanding challenges. PLoS
Comput. Biol. 8, e1002375 (2012).
74. Burgard,A.P., Pharkya,P. & Maranas,C.D. Optknock: a
bilevel programming framework for identifying gene knockout
strategies for microbial strain optimization. Biotechnol. Bioeng.
84, 647657 (2003).
75. Oberhardt,M.A., Yizhak,K. & Ruppin,E. Metabolically
re-modeling the drug pipeline. Curr. Opin. Pharmacol. 13, 778785
(2013).This is a review on using constraint-based modelling for
drug discovery.
76. Hsu,P.P. & Sabatini,D.M. Cancer cell metabolism: Warburg
and beyond. Cell 134, 703707 (2008).
77. Folger,O. etal. Predicting selective drug targets in cancer
through metabolic networks. Mol. Syst. Biol. 7, 501 (2011).
78. Frezza,C. etal. Haem oxygenase is synthetically lethal with
the tumour suppressor fumarate hydratase. Nature 477, 225228
(2011).
79. Jerby,L., Shlomi,T. & Ruppin,E. Computational
reconstruction of tissue-specific metabolic models: application to
human liver metabolism. Mol. Syst. Biol. 6, 401 (2010).
80. Kim,P.J. etal. Metabolite essentiality elucidates robustness
of Escherichia coli metabolism. Proc. Natl Acad. Sci. USA 104,
1363813642 (2007).
81. Kim,H.U. etal. Integrative genome-scale metabolic analysis
of Vibrio vulnificus for drug targeting and discovery. Mol. Syst.
Biol. 7, 460 (2011).
82. Brynildsen,M.P., Winkler,J.A., Spina,C.S., MacDonald,I.C.
& Collins,J.J. Potentiating antibacterial activity by
predictably enhancing endogenous microbial ROS production. Nature
Biotech. 31, 160165 (2013).
83. Lerman,J.A. etal. In silico method for modelling metabolism
and gene product expression at genome scale. Nature Commun. 3, 929
(2012).
84. Zhang,Y. etal. Three-dimensional structural view of the
central metabolic network of Thermotoga maritima. Science 325,
15441549 (2009).
85. Thiele,I., Fleming,R.M., Bordbar,A., Schellenberger,J. &
Palsson,B.O. Functional characterization of alternate optimal
solutions of Escherichia colis transcriptional and translational
machinery. Biophys. J. 98, 20722081 (2010).
86. Pramanik,J. & Keasling,J.D. Effect of Escherichia coli
biomass composition on central metabolic fluxes predicted by a
stoichiometric model. Biotechnol. Bioeng. 60, 230238 (1998).
87. Rodionova,I.A. etal. Diversity and versatility of the
Thermotoga maritima sugar kinome. J.Bacteriol. 194, 55525563
(2012).
88. OBrien,E.J., Lerman,J.A., Chang,R.L., Hyduke,D.R. &
Palsson,B.O. Genome-scale models of metabolism and gene expression
extend and refine growth phenotype prediction. Mol. Syst. Biol. 9,
693 (2013).
89. Chandrasekaran,S. & Price,N.D. Probabilistic integrative
modeling of genome-scale metabolic and regulatory networks in
Escherichia coli and Mycobacterium tuberculosis. Proc. Natl Acad.
Sci. USA 107, 1784517850 (2010).
90. Covert,M.W., Knight,E.M., Reed,J.L., Herrgard,M.J. &
Palsson,B.O. Integrating high-throughput and computational data
elucidates bacterial networks. Nature 429, 9296 (2004).
91. Chang,R.L. etal. Structural systems biology evaluation of
metabolic thermotolerance in Escherichia coli. Science 340,
12201223 (2013).
92. Gu,J. & Bourne,P.E. Structural bioinformatics
(Wiley-Blackwell, 2009).
93. Marr,A.G. & Ingraham,J.L. Effect of temperature on the
composition of fatty acids in Escherichia coli. J.Bacteriol. 84,
12601267 (1962).
94. Tenaillon,O. etal. The molecular diversity of adaptive
convergence. Science 335, 457461 (2012).
95. Mrters,P., Peres,Y., Schramm,O. & Werner,W. Brownian
motion (Cambridge Univ. Press, 2010).
96. Karr,J.R. etal. A whole-cell computational model predicts
phenotype from genotype. Cell 150, 389401 (2012).
97. Thiele,I. etal. A community-driven global reconstruction of
human metabolism. Nature Biotech. 31, 419425 (2013).
98. Borenstein,E. Computational systems biology and insilico
modeling of the human microbiome. Brief Bioinform. 13, 769780
(2012).
99. Levy,R. & Borenstein,E. Metabolic modeling of species
interaction in the human microbiome elucidates community-level
assembly rules. Proc. Natl Acad. Sci. USA 110, 1280412809
(2013).
100. Atkinson,D.E. The energy charge of the adenylate pool as a
regulatory parameter. Interaction with feedback modifiers.
Biochemistry 7, 40304034 (1968).
101. Weisz,P.B. Diffusion and chemical transformation. Science
179, 433440 (1973).
102. Reed,J.L. Shrinking the metabolic solution space using
experimental datasets. PLoS Comput. Biol. 8, e1002662 (2012).This
is a review of the potential constraints that have been placed on
CBMs.
103. Colijn,C. etal. Interpreting expression data with metabolic
flux models: predicting Mycobacterium tuberculosis mycolic acid
production. PLoS Comput. Biol. 5, e1000489 (2009).
104. Orth,J.D., Thiele,I. & Palsson,B.O. What is flux
balance analysis? Nature Biotech. 28, 245248 (2010).This paper
presents a primer on the theory, applications and software
toolboxes for FBA.
105. Mahadevan,R. & Schilling,C.H. The effects of alternate
optimal solutions in constraint-based genome-scale metabolic
models. Metab. Eng. 5, 264276 (2003).
106. Wilkinson,D.J. Stochastic modelling for quantitative
description of heterogeneous biological systems. Nature Rev. Genet.
10, 122133 (2009).
107. Steuer,R. Computational approaches to the topology,
stability and dynamics of metabolic networks. Phytochemistry 68,
21392151 (2007).
108. de Jong,H. Modeling and simulation of genetic regulatory
systems: a literature review. J.Comput. Biol. 9, 67103 (2002).
109. Friedman,N., Linial,M., Nachman,I. & Peer,D. Using
Bayesian networks to analyze expression data. J.Computat. Biol. 7,
601620 (2000).
110. Stephens,M. & Balding,D.J. Bayesian statistical methods
for genetic association studies. Nature Rev. Genet. 10, 681690
(2009).
111. Ideker,T. & Krogan,N.J. Differential network biology.
Mol. Syst. Biol. 8, 565 (2012).
112. Califano,A., Butte,A.J., Friend,S., Ideker,T. &
Schadt,E. Leveraging models of cell regulation and GWAS data in
integrative network-based association studies. Nature Genet. 44,
841847 (2012).
AcknowledgementsThe authors thank D. Zielinski, J. Lerman, N. E.
Lewis and H. Nagarajan for their criticisms and comments on the
manu-script. This work was supported by the US National Institutes
of Health grants GM068837 and GM057089, and by the Novo Nordisk
Foundation. Z.A.K. is supported through the US National Science
Foundation Graduate Research Fellowship (DGE-1144086).
Competing interests statementThe authors declare no competing
interests.
FURTHER INFORMATIONBiGG database:
http://bigg.ucsd.edu/Literature on COBRA (constraint-based
reconstruction and analysis) methods:
http://sbrg.ucsd.edu/cobra-methodsLiterature on model-driven
analysis: http://sbrg.ucsd.edu/cobra-predictionsOpenCOBRA project:
http://opencobra.sourceforge.net/
ALL LINKS ARE ACTIVE IN THE ONLINE PDF
R E V I E W S
14 | ADVANCE ONLINE PUBLICATION
www.nature.com/reviews/genetics
2014 Macmillan Publishers Limited. All rights reserved
http://bigg.ucsd.edu/http://sbrg.ucsd.edu/cobra-methodshttp://sbrg.ucsd.edu/cobra-predictionshttp://sbrg.ucsd.edu/cobra-predictionshttp://opencobra.sourceforge.net/
Abstract | The prediction of cellular function from a genotype
is a fundamental goal in biology. For metabolism, constraint-based
modelling methods systematize biochemical, genetic and genomic
knowledge into a mathematical framework that enables a
mechaniFoundational developmentsTable 1 | A comparison of modelling
and analysis techniques for high-throughput dataRefining
objectivesBox 1 | Constraint-based modelling: motivation and
definitionBox 2 | Constraint-based modelling: introduction to
methods for analysisContextualizing omic dataFigure 1 | The
multiple uses of high-throughput data in constraint-based
models.Constraint-based modelling can be used to interpret and
augment omic data sets by using an underlying cellular network that
has been biochemically validated. Metabolites are Characterizing
interaction networksTargeted expansion of metabolic knowledgeFigure
2 | Predictive case studies in understanding underlying principles
of interaction networks.Many network types are used to represent
cellular behaviour. Recent studies have compared the properties of
interaction networks against constraint-based moDesigning metabolic
phenotypesDiscovering drug targetsFigure 3 | Predictive case
studies in metabolic engineering and drug
targeting.Constraint-based models have been used for answering
important questions in translational research. a | One study70 used
multiple computational and experimental tools to desigCoupling with
other cellular processesFigure 4 | Expanding predictive scope
through integrative modelling.The predictive scope of
constraint-based modelling has been extended beyond metabolism
either by explicitly accounting for non-metabolic components in the
constraint-based modelling apprConclusions