
10th World Congress on Structural and Multidisciplinary Optimization
May 19 - 24, 2013, Orlando, Florida, USA

On the Use of MPCCs in Combined Topological and Parametric Design of Genetic Regulatory Circuits

Tinghao Guo, James T. Allison

Department of Industrial and Enterprise Systems Engineering

University of Illinois at Urbana-Champaign

117 Transportation Building, 104 S. Mathews Ave.

Urbana, Illinois, 61801, USA

{guo32,jtalliso}@illinois.edu

1. Abstract
Synthetic biology is a field that involves design of artificial biological devices and systems with the objective of achieving specific biological functions. While a large number of synthetic biological systems have been created over the years, current design methods are limited in the complexity of systems that can be developed. Here we focus on the design of genetic regulatory circuits, and present a new design method based on direct transcription (DT), a technique for dynamic system optimization, and mathematical programs with complementarity constraints (MPCCs). This method simultaneously optimizes circuit topology and continuous parameters, with circuit models based on the Michaelis-Menten equations. The case study involves design for adaptation. The objective is to minimize change in steady-state output after a change in input. A sensitivity constraint is imposed that ensures a change in input can be detected. The problem is solved using both DT and single-shooting (i.e., nested simulation) for comparison. The MPCC formulation enables solution of four-node problems, an improvement upon existing approaches and a step toward larger systems. In addition to the simultaneous solution approaches described above, a nested approach was also investigated where an outer loop solves the discrete topology optimization problem (avoiding complementarity constraints), and an inner loop solves the continuous parameter optimization problem for each candidate topology. The simultaneous approach based on DT and the MPCC formulation is shown to yield robust network topological designs that achieve adaptation.

2. Keywords: synthetic biology, genetic regulatory circuit design, topology optimization, direct transcription, nonlinear programming, MPCCs.

3. Introduction
3.1. Synthetic Biology

Synthetic biology is a relatively new research area that involves the design of synthetic biological devices and systems. These systems have potential for a wide variety of applications, such as biomedical treatments. A large number of genetic elements and devices have been created over the years. Simple early examples of synthetic genetic regulatory circuits include a genetic toggle switch and an oscillating genetic circuit developed by Gardner et al. and Elowitz et al. [1,2]. Their work involved a combination of mathematical models and experiments in the domain of genetic engineering [3]. More comprehensive reviews can be found in Refs. [4–8]. Computational tools have also been developed for the design of gene circuits. Knight [9] proposed BioBrick, a tool that is based on the assembly of compatible biological components (and that is amenable to optimization and automation) [10]. Clotho's Eugene is a domain-specific language that analyzes and improves synthetic biological designs. This language permits the use of rule-based constraints, preventing faulty construction and reducing development time and cost [11]. Genetic Engineering of Living Cells (GEC) is a formal programming language that allows logical interactions between proteins and genes, but this approach utilizes exhaustive search, so it is not appropriate for large systems [12]. Beal et al. proposed Proto, a spatial computing language that supports the development of gene circuits and the construction of tissues and organs. Proto involves a compiler that maps a high-level behavioral description to an abstract genetic network and then optimizes this gene network [13,14].

Established strategies from other domains of engineering have been extended to the design of gene circuits. For instance, Yi et al. used integral feedback control and dynamical systems theory to study the robustness of perfect adaptation in bacterial chemotaxis [15]. Stelling et al. showed that positive and negative feedback control can help enhance sensitivity and stability of the genetic circuits [16]. Rao et al. proposed a computational model for Bacillus subtilis chemotaxis and showed that the feedback control structure was an evolutionarily conserved property [17]. Beal et al. used a multi-input sinusoidal transfer curve to describe the behavior of gene circuits [14]. Yang et al. focused on biphasic response in the chemoattractant using positive feedback control in Dictyostelium cells [18].

Mathematical modeling, simulation and experimental data are also important in synthetic biology. For a specific desired circuit behavior, constraints and dynamical behavior in genetic circuits may be established so that network topologies and kinetic parameters can be identified [19]. Braun et al. used a transcription cascade and a pulse generating network to study a statistical parameter fitting algorithm [20]. Dunlop et al. created a model for microbial biofuel production to increase cell viability and biofuel yields [21]. Batt et al. presented the analysis of a class of uncertain piecewise-multiaffine differential equation models, and demonstrated a method for tuning a synthetic transcriptional cascade created in Escherichia coli [22].

Despite significant achievements, artificial gene circuit design remains a challenging task. Existing design methods are limited, particularly in their ability to handle large-scale genetic circuits. The dimension of successfully implemented genetic circuits, measured by the number of regulatory regions, has plateaued at six for several years [4]. While these simple systems can perform simple biological functions, they are not yet sophisticated enough to provide practical benefits. While some optimization approaches have been explored [23], a comprehensive optimization-based approach has not yet been investigated. Realizing the vision of tackling grand challenges such as biomedical therapies with synthetic biology will require new design principles and methodologies that can scale up to larger system designs (≫ 6 regulatory regions). In this article we propose an optimization-based method that is one important step toward the realization of gene circuit designs of practical importance.

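3.2. Direct Transcription
Direct transcription (DT) is a method for dynamic system optimization. It has been applied widely to trajectory optimization, optimal control, and parameter estimation problems in a variety of domains [24–28]. We apply it here for the first time to gene circuit design to develop circuits with optimal dynamic performance. Consider the following continuous dynamic problem:

$$
\begin{aligned}
\min_{\boldsymbol{\xi}(t),\,\mathbf{u}(t),\,t_F,\,\mathbf{p}} \quad & J = \phi(\boldsymbol{\xi}(t), \mathbf{u}(t), t_F, \mathbf{p}) && \text{(1a)} \\
\text{s.t.} \quad & \dot{\boldsymbol{\xi}} = \mathbf{f}_d(\boldsymbol{\xi}(t), \mathbf{u}(t), t, \mathbf{p}) && \text{(1b)} \\
& \mathbf{0} = \mathbf{f}_a(\boldsymbol{\xi}(t), \mathbf{u}(t), t, \mathbf{p}) && \text{(1c)} \\
& \mathbf{0} = \boldsymbol{\psi}(\boldsymbol{\xi}(t_F), \mathbf{u}(t_F), t_F, \mathbf{p}). && \text{(1d)}
\end{aligned}
$$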

Equation (1a) defines the cost function, where $\boldsymbol{\xi}(t)$ is the state vector and $\mathbf{u}(t)$ is the control input, both of which are functions of time $t \in [t_0, t_F]$. $\mathbf{p}$ represents time-independent parameters. Constraints (1b) and (1c) are the differential and algebraic equations that govern the dynamics of the system. These two sets of constraints combined form a system of differential algebraic equations (DAEs) [29]. The system may have initial conditions $\boldsymbol{\xi}(t_0) = \boldsymbol{\xi}_0$, as well as a boundary condition at the final time $t_F$ (Eqn. (1d)).

A conventional solution to this problem involves application of first order optimality conditions (e.g., Pontryagin's Maximum Principle (PMP) [30, 31]) that produce a boundary value problem (BVP). In simple cases a closed-form solution may be derived; in more general cases this BVP must be solved numerically. In addition, system Hamiltonian derivatives must be derived, which is difficult in some cases.

A solution to the above problem produces an optimal trajectory $\mathbf{u}^*(t)$ (open-loop control), as well as $\mathbf{p}^*$ and the corresponding state trajectories and final time. Feedback control can be implemented by replacing $\mathbf{u}(t)$ with control design variables $\mathbf{x}_c$ that parameterize a feedback controller; $\mathbf{x}_c$ can be optimized together with the time-independent parameters $\mathbf{p}$.

$$
\min_{\boldsymbol{\xi}(t),\,\mathbf{x}_c,\,\mathbf{p}} \quad J = \phi(\boldsymbol{\xi}(t), \mathbf{u}(t), \mathbf{p}, t_F) \tag{2}
$$

A closed-form solution exists for linear dynamic systems. For nonlinear systems, a PMP-based approach may work; if it does not, other solution methods should be explored. One well-used solution method is a nested simulation approach, where the DAE (or ODE if no algebraic constraints exist) is solved using numerical simulation for every design alternative tested by the optimization algorithm. This nested simulation method is often referred to as single-shooting. Consider an optimal feedback control problem where $t_F$ and $\mathbf{p}$ are fixed, and no algebraic or boundary constraints are present:

$$
\min_{\mathbf{x}_c} \quad J = \phi(\boldsymbol{\Xi}, \mathbf{x}_c). \tag{3}
$$

Here $\boldsymbol{\Xi}$ denotes discretized state variables, which are solved for using numerical simulation for every new candidate control design $\mathbf{x}_c$. This problem may be solved using standard algorithms for nonlinear programming (NLP) [32]. While straightforward and intuitive, this approach suffers from several drawbacks. Simulation can lead to cost and constraint functions that are non-smooth or arithmetically inconsistent [28]. Numerical integration of DAEs may become very time-consuming for large-scale problems [24]. In addition, single-shooting cannot handle open loop instabilities [33,34], and solutions may be numerically unstable for highly nonlinear or stiff systems.
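The nested structure of single-shooting can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the scalar test system $\dot{\xi} = -x_c \xi + 1$, the forward Euler integrator, the target value, and the golden-section line search that stands in for a general NLP solver. The point is only that every cost evaluation triggers a complete simulation.

```python
def simulate(xc, xi0=0.0, t_final=5.0, n_steps=500):
    """Forward Euler simulation of the hypothetical system d(xi)/dt = -xc*xi + 1.

    Returns the discretized state trajectory (the role played by Xi in Eq. (3)).
    """
    h = t_final / n_steps
    xi = [xi0]
    for _ in range(n_steps):
        xi.append(xi[-1] + h * (-xc * xi[-1] + 1.0))
    return xi


def cost(xc, target=0.5):
    """Objective phi(Xi, xc): squared miss of the final state from a target.

    Each call runs a full simulation, which is what makes the approach nested.
    """
    return (simulate(xc)[-1] - target) ** 2


def golden_section(f, lo, hi, tol=1e-6):
    """Golden-section search for a unimodal scalar function on [lo, hi].

    Used here only as a simple stand-in for an NLP algorithm.
    """
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2


if __name__ == "__main__":
    best_xc = golden_section(cost, 0.5, 5.0)
    print("best control parameter:", best_xc, "cost:", cost(best_xc))
```

For this test system the steady state is $1/x_c$, so the search settles near $x_c = 2$ for a target of 0.5.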

Multiple shooting addresses instabilities by dividing the simulation into $n_T$ smaller time segments. The system is simulated within each of these segments independently, and defect constraints are included in the NLP formulation to ensure consistency and state continuity between segments at optimization convergence. If the size of time segments is reduced to the limiting value of the individual time step sizes (i.e., one time step per time segment), the result is the direct transcription method. No simulation is required. Satisfaction of the defect constraints ensures approximate satisfaction of the underlying DAEs. More specifically, the time domain is divided into $n_t$ segments or intervals $t_0 < t_1 < \ldots < t_{n_t} = t_F$. The discretized state ($\boldsymbol{\Xi}$) and control ($\mathbf{U}$) trajectories are optimization variables in the DT formulation.

The exact form of the defect constraints depends on the collocation method used to discretize the system equations. For example, the first order Runge-Kutta method (forward Euler method) is given by:

$$
\boldsymbol{\xi}_i = \boldsymbol{\xi}_{i-1} + h_i \mathbf{f}_d(\boldsymbol{\xi}_{i-1}, \mathbf{x}_c(t_{i-1}), t_{i-1}, \mathbf{p}), \tag{4}
$$

where $h_i$ is the step size at time $t_i$. Higher order Runge-Kutta methods, such as the implicit trapezoidal method, can be used to generate defect constraints:

$$
\boldsymbol{\zeta}_i(\boldsymbol{\Xi}, \mathbf{x}_c, \mathbf{p}) = \boldsymbol{\xi}_i - \boldsymbol{\xi}_{i-1} - \frac{h_i}{2} \left( \mathbf{f}_d(\boldsymbol{\xi}_{i-1}, \mathbf{x}_c, t_{i-1}, \mathbf{p}) + \mathbf{f}_d(\boldsymbol{\xi}_i, \mathbf{x}_c, t_i, \mathbf{p}) \right). \tag{5}
$$

While DT results in a high-dimension optimization problem, its sparse problem structure supports efficient solution [28,35]. In this article we will demonstrate a new DT solution for the gene circuit design problem.
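The trapezoidal defect constraints of Eq. (5) can be evaluated directly from a candidate state trajectory, without running a simulation. The sketch below assumes a scalar state and a user-supplied derivative function `f(xi, xc, t)`; both the function name `trapezoidal_defects` and that calling convention are illustrative choices. In a DT formulation these values would be passed to the NLP solver as equality constraints.

```python
def trapezoidal_defects(xi, t, f, xc):
    """Evaluate the trapezoidal defects of Eq. (5) for a scalar state.

    xi : candidate state values at the time nodes (optimization variables)
    t  : time nodes t_0 < t_1 < ... < t_nt
    f  : derivative function f(xi, xc, t) of the (hypothetical) system
    xc : design or control parameters forwarded to f
    Returns one defect per interval; all are zero when the trajectory
    satisfies the discretized dynamics.
    """
    defects = []
    for i in range(1, len(t)):
        h = t[i] - t[i - 1]
        f_prev = f(xi[i - 1], xc, t[i - 1])
        f_curr = f(xi[i], xc, t[i])
        defects.append(xi[i] - xi[i - 1] - 0.5 * h * (f_prev + f_curr))
    return defects


if __name__ == "__main__":
    # Test system d(xi)/dt = xc * t with xc = 2, whose exact solution is t**2.
    nodes = [0.0, 0.5, 1.0, 1.5]
    derivative = lambda xi, xc, t: xc * t
    print(trapezoidal_defects([tt ** 2 for tt in nodes], nodes, derivative, 2.0))
```

Because the trapezoidal rule integrates a derivative that is linear in time exactly, the defects vanish for the exact trajectory of this test system, while an arbitrary trajectory leaves nonzero defects that the optimizer must drive to zero.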

3.3. Complementarity Constraints
Complementarity constraints (CCs) are useful in many applications such as parameter estimation [36], metabolic flux networks [37], truss topology design [38], and hybrid dynamic systems [39]. We use CCs here to manage gene circuit topology decisions.

Complementarity refers to a relationship between variables where either one (or both) is at its bound. A mathematical program with complementarity constraints takes the following form [40]:

$$
\begin{aligned}
\min_{\mathbf{x},\mathbf{y},\mathbf{z}} \quad & f(\mathbf{x}, \mathbf{y}, \mathbf{z}) \\
\text{s.t.} \quad & \mathbf{h}(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \mathbf{0} \\
& \mathbf{0} \le \mathbf{x} \perp \mathbf{y} \ge \mathbf{0}
\end{aligned} \tag{6}
$$

The complementarity constraints are given in the last line, and imply that either $x_i = 0$ or $y_i = 0$, and $\mathbf{x} \ge \mathbf{0}$, $\mathbf{y} \ge \mathbf{0}$. While complementarity constraints (CCs) can be used to represent discrete decisions, the resulting nonlinear programming problems do not satisfy regularity requirements, such as the linear independence constraint qualification (LICQ) or even the weaker Mangasarian-Fromovitz constraint qualification (MFCQ). As a result, multipliers may be non-unique or unbounded [40, 41]. Mathematical programs with complementarity constraints (MPCCs) are therefore difficult to solve using standard NLP solvers.
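The loss of LICQ can be checked by hand for a single complementarity pair. The sketch below assumes the pair is posed to an NLP solver as $x \ge 0$, $y \ge 0$, $xy = 0$ (one common way to write it, chosen here for illustration) and tests whether the gradients of the active constraints are linearly independent at a feasible point. The helper names are hypothetical.

```python
def active_gradients(x, y, tol=1e-12):
    """Gradients of the constraints active at a feasible point (x, y) of
    x >= 0, y >= 0, x*y = 0."""
    grads = []
    if abs(x) <= tol:
        grads.append((1.0, 0.0))   # bound x >= 0 is active
    if abs(y) <= tol:
        grads.append((0.0, 1.0))   # bound y >= 0 is active
    grads.append((y, x))           # equality x*y = 0 is always active
    return grads


def rank_2d(vectors, tol=1e-12):
    """Rank of a collection of 2-D vectors (0, 1, or 2)."""
    nonzero = [v for v in vectors if abs(v[0]) > tol or abs(v[1]) > tol]
    if not nonzero:
        return 0
    first = nonzero[0]
    for v in nonzero[1:]:
        # A nonzero 2x2 determinant means v is not parallel to the first vector.
        if abs(first[0] * v[1] - first[1] * v[0]) > tol:
            return 2
    return 1


def licq_holds(x, y):
    """LICQ holds when the active constraint gradients are linearly independent."""
    grads = active_gradients(x, y)
    return rank_2d(grads) == len(grads)


if __name__ == "__main__":
    for point in [(0.0, 1.0), (2.0, 0.0), (0.0, 0.0)]:
        print(point, "LICQ holds:", licq_holds(*point))
```

At every feasible point the gradient of $xy$ is parallel to the gradient of an active bound (or is zero), so the check fails everywhere on the feasible set, not just at the origin.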

Several equivalent reformulations have been investigated using relaxed complementarity constraints [40,42,43]. One approach is to approximate CCs using a nonnegative scalar $t$ (Reg($t$): $x_i y_i \le t$, $\mathbf{x} \ge \mathbf{0}$, $\mathbf{y} \ge \mathbf{0}$). Alternatively, the complementarity condition can be approximated by a single constraint (RegComp($t$): $\mathbf{x}^T \mathbf{y} \le t$, $\mathbf{x} \ge \mathbf{0}$, $\mathbf{y} \ge \mathbf{0}$) or replaced by equalities (RegEq($t$): $x_i y_i = t$, $\mathbf{x} \ge \mathbf{0}$, $\mathbf{y} \ge \mathbf{0}$). These MPCC reformulations exhibit different local convergence properties under mild conditions. In addition, an exact $\ell_1$ penalty function may also be used; the complementarity constraint is enforced using a penalty term in the objective function (PF($\rho$): $\min f(\mathbf{x}, \mathbf{y}, \mathbf{z}) + \rho \mathbf{x}^T \mathbf{y}$, $\mathbf{x} \ge \mathbf{0}$, $\mathbf{y} \ge \mathbf{0}$). These reformulations may be solved using standard optimization algorithms.

4. Design Problem
Many efforts have concentrated on achieving particular biological functions in gene circuits, as well as extracting general design principles. Shen-Orr and Milo [44, 45] defined topological network motifs to explore structural design principles, and identified three highly significant motifs. Each of these motifs has a specific function in determining gene expression. Motifs that correspond to specific biological functions have also been studied, such as oscillations [46, 47] and adaptation [48–50]. The relationship between robustness and topology has also been studied [51].

Ma et al. carried out a series of analyses on three-node gene circuits and successfully identified two core topologies that produce adaptive behavior [52]. A circuit exhibits adaptation if its output returns to original levels after an input disturbance. Ma et al. used exhaustive enumeration to explore all possible three-node network topologies. Each topology was tested for adaptive properties by simulating the circuit many times with different parameter value combinations (10,000 parameter samples). This nested exhaustive enumeration approach carries with it extreme computational expense, and the size of circuits that can be explored in this manner therefore is extremely limited.

Our objective here is to extend existing engineering design optimization techniques in a way that supports the identification of gene circuit designs that exhibit desired behavior with much less computational effort. Furthermore, these efficiency improvements will also likely increase the dimension of synthetic gene circuits that can be designed and implemented successfully. In this article we present the results of a first strategy for applying optimization to the combined topology and continuous variable gene circuit design problem. DT is employed to solve for optimal continuous parameters, and complementarity constraints are used to manage topological decisions. Other optimization-based strategies are also being investigated in ongoing work.

4.1 Problem Formulation
Figure 1 illustrates one possible circuit topology, where A receives input I and node C produces the output. A third node B acts as a regulator. Figure 2 illustrates a step input change used in this study, and a representative output response. Here we seek circuit designs with maximally adaptive behavior, i.e., the error $E$ between the steady-state output before and after the step input change is minimized. Equivalently, the precision $P$ (the inverse of $E$) is to be maximized.

Figure 1: An example three-node circuit topology

Figure 2: Input and output time response

In addition, it is important to maintain an acceptable level of sensitivity (i.e., the largest relative output change compared to the relative input change). Here we require sensitivity to be greater than or equal to one. The resulting optimization formulation, with normalized error and sensitivity metrics, is:

$$
\min_{\mathbf{x}} \quad E = \left| \frac{(O_2 - O_1)/O_1}{(I_2 - I_1)/I_1} \right|, \quad \text{subject to:} \quad \left| \frac{(O_{peak} - O_1)/O_1}{(I_2 - I_1)/I_1} \right| \ge 1, \tag{7}
$$

where $\mathbf{x}$ represents a combination of topological and continuous design variables.
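The two metrics in (7) can be computed from a sampled output response as in the following sketch. It assumes that the response has settled both immediately before the input step and by the final sample, and it takes $O_{peak}$ to be the post-step sample that deviates most from $O_1$; these conventions, and the function name `adaptation_metrics`, are illustrative assumptions rather than details given in the text.

```python
def adaptation_metrics(output, i1, i2, step_index):
    """Compute the normalized error E and sensitivity S of formulation (7).

    output     : sampled output response O(t)
    i1, i2     : input level before and after the step
    step_index : index of the first sample after the input step
    Assumes the output is at steady state just before the step and at the
    final sample.
    """
    o1 = output[step_index - 1]            # steady state before the step
    o2 = output[-1]                        # steady state after the step
    post_step = output[step_index:]
    o_peak = max(post_step, key=lambda o: abs(o - o1))
    relative_input_change = (i2 - i1) / i1
    error = abs(((o2 - o1) / o1) / relative_input_change)
    sensitivity = abs(((o_peak - o1) / o1) / relative_input_change)
    return error, sensitivity


if __name__ == "__main__":
    # Hypothetical response: settled at 0.4, peaks at 0.6, returns to 0.404.
    response = [0.4] * 5 + [0.6, 0.5, 0.44, 0.41, 0.404]
    E, S = adaptation_metrics(response, i1=0.5, i2=0.6, step_index=5)
    print("error:", E, "sensitivity:", S, "feasible:", S >= 1.0)
```

A design is feasible when the returned sensitivity is at least one, and better adapted when the returned error is smaller.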


4.2 Dynamic System Model
As performed in Ref. [52], the dynamic output response $O(t)$ of the gene circuit is obtained by using numerical simulation to solve the Michaelis-Menten equations that correspond to the candidate gene circuit design [53]. Each node in this model has a normalized enzyme concentration that can be converted between two forms (active and inactive). This conversion is governed by active enzyme levels produced by other nodes in the network. For example, in Fig. 1 the arrow connecting node B to node C represents a positive regulation of node C by node B. This means that increased production of the enzyme that is emitted by active enzyme at node B will convert the enzyme levels at node C from its inactive form $1 - C$ to active form $C$. In other words, increased enzyme production by active enzyme at node B will increase the production of the active enzyme level at node C.

A negative regulation is also possible. In Fig. 1, the connection between node C and B with a circle at node B represents a negative regulation (i.e., C −• B). This means that the active enzyme produced by C catalyzes the transition of active form $B$ to inactive form $1 - B$ at node B. Figure 3 illustrates these relations between node B and C in greater detail.

Figure 3: Inner conversion between node B and C. (a) Positive regulation: B to C. (b) Negative regulation: C to B.

In Fig. 3(a), $C$ and $1 - C$ represent the concentrations of the active and inactive forms at node C, respectively. The arc with the symbol B above it indicates that the active enzyme produced by node B causes the transition of enzyme levels at node C from its inactive to active form. The differential equation that models this transition is given at the bottom of Fig. 3(a). Here $k_{BC}$ is the catalytic rate constant, and $K_{BC}$ is the Michaelis-Menten constant. Figure 3(b) outlines the negative regulation of C on B.

Often the basal (minimum) enzyme production level is non-zero. If a node has only positive (negative) incoming regulations, it is assumed that a basal enzyme level would deactivate (activate) this node. Here we use notations $E_i$ and $F_i$ to represent activating and deactivating basal enzymes respectively, where $i \in \{A, B, C\}$. The basal enzyme in this model is set to be a constant value (0.5), and the corresponding rate equations are illustrated in Fig. 4.

An additional regulation type is the self-loop regulation (see the positive self-loop for node B in Fig. 1). In this study we assume for simplicity that self-loops do not occur. Because multiple regulations on a node are additive, the complete rate equation for node $T$ is:

$$
\frac{dT}{dt} = \sum_i X_i k_{X_i T} \frac{1 - T}{(1 - T) + K_{X_i T}} - \sum_i Y_i k'_{Y_i T} \frac{T}{T + K'_{Y_i T}}, \tag{8}
$$

where $T \in \{A, B, C\}$, and $X_i$ is the $i$th basal activating enzyme (positive regulator) of node $T$, and belongs to the set $\{A, B, C, E_B, E_C\}$. $Y_i$ is the $i$th basal deactivating enzyme (negative regulator) of node $T$, and is a member of the set $\{A, B, C, F_A, F_B, F_C\}$. The particular values of the catalytic rate constants ($k_{cat}$ values) and the Michaelis-Menten constants ($K_m$ values) of the enzymes have the ranges $0 \le k_{ij} \le 10$ and $0 \le K_{ij} \le 100$, where $i$ and $j$ refer to particular nodes and basal activating/deactivating enzymes.
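The additive structure of Eq. (8) maps directly onto a short function. The sketch below is illustrative only: the function name `node_rate`, the tuple layout for regulators, and the numerical values in the usage example are assumptions, not values from the study.

```python
def node_rate(T, activators, deactivators):
    """Right-hand side of Eq. (8) for one node.

    T            : active (normalized) enzyme concentration at the node
    activators   : iterable of (X, k, K) tuples, one per positive regulator
    deactivators : iterable of (Y, k_prime, K_prime) tuples, one per
                   negative regulator
    Note: a term is undefined if its denominator is zero (e.g. K = 0 with
    T = 1), so callers should keep the Michaelis-Menten constants positive.
    """
    gain = sum(X * k * (1.0 - T) / ((1.0 - T) + K) for X, k, K in activators)
    loss = sum(Y * kp * T / (T + Kp) for Y, kp, Kp in deactivators)
    return gain - loss


if __name__ == "__main__":
    # Hypothetical node C: activated by node B (B = 0.8) and deactivated by
    # the basal enzyme F_C = 0.5; all rate constants are made-up values.
    dC_dt = node_rate(
        T=0.5,
        activators=[(0.8, 2.0, 0.5)],
        deactivators=[(0.5, 1.0, 0.5)],
    )
    print("dC/dt =", dC_dt)
```

Setting a rate constant to zero removes the corresponding term, which is how a zero parameter encodes a missing regulation link.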

4.3 MPCC Formulation
Complementarity constraints (CCs) are used here to model the existence of regulations between different nodes. CCs are a natural problem formulation as the link between two nodes may only have positive, negative, or no regulation. To clarify, node C cannot exert both positive and negative regulation on node B in Fig. 1. This requirement can be formulated as a complementarity constraint. For example, such a constraint for the three-node network would be $k_{ij} k'_{ij} = 0$, $\forall \{i, j\} \in \{A, B, C\} \times \{A, B, C\}$.

Figure 5 illustrates how topology decisions may be modeled using CCs. At most one of $k_{BC}$ and $k'_{BC}$ can exist between node B and C. This is enforced by the constraint $k_{BC} k'_{BC} = 0$. Likewise, the constraints $k_{BC} k_{E_C C} = 0$ and $k'_{F_C C} k'_{BC} = 0$ ensure that a deactivating or activating enzyme is available when there are only positive or negative linkages coming into node C. Because the circuit depends on values of parameters $\mathbf{k}$ and $\mathbf{K}$, the states and the parameters here are considered as optimization variables in the model. If a parameter is zero, its corresponding regulation does not exist; if it is nonzero, the corresponding regulation does exist.

Figure 4: Regulations between node B and C

Figure 5: Complementarity Constraints

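The MPCC formulation for the dynamic three-node gene circuit design problem is:

$$
\begin{aligned}
\min_{A,B,C,\mathbf{k},\mathbf{K}} \quad & E = \left| \frac{(O_2 - O_1)/O_1}{(I_2 - I_1)/I_1} \right| \\
\text{s.t.} \quad & \left| \frac{(O_{peak} - O_1)/O_1}{(I_2 - I_1)/I_1} \right| \ge 1 \\
& \frac{dA}{dt} = I k_{IA} \frac{(1-A)}{(1-A) + K_{IA}} + B k_{BA} \frac{(1-A)}{(1-A) + K_{BA}} + C k_{CA} \frac{(1-A)}{(1-A) + K_{CA}} \\
& \qquad\quad - B k'_{BA} \frac{A}{A + K'_{BA}} - C k'_{CA} \frac{A}{A + K'_{CA}} - F_A k'_{F_A A} \frac{A}{A + K'_{F_A A}} \\
& \frac{dB}{dt} = A k_{AB} \frac{(1-B)}{(1-B) + K_{AB}} + C k_{CB} \frac{(1-B)}{(1-B) + K_{CB}} + E_B k_{E_B B} \frac{(1-B)}{(1-B) + K_{E_B B}} \\
& \qquad\quad - A k'_{AB} \frac{B}{B + K'_{AB}} - C k'_{CB} \frac{B}{B + K'_{CB}} - F_B k'_{F_B B} \frac{B}{B + K'_{F_B B}} \\
& \frac{dC}{dt} = A k_{AC} \frac{(1-C)}{(1-C) + K_{AC}} + B k_{BC} \frac{(1-C)}{(1-C) + K_{BC}} + E_C k_{E_C C} \frac{(1-C)}{(1-C) + K_{E_C C}} \\
& \qquad\quad - A k'_{AC} \frac{C}{C + K'_{AC}} - B k'_{BC} \frac{C}{C + K'_{BC}} - F_C k'_{F_C C} \frac{C}{C + K'_{F_C C}} \\
& \mathbf{0} \le \mathbf{k}_1 \perp \mathbf{k}_2 \ge \mathbf{0}, \quad \mathbf{0} \le \mathbf{k} \le 10, \quad \mathbf{0} \le \mathbf{K} \le 100
\end{aligned} \tag{9}
$$

where $\mathbf{k}_1$ and $\mathbf{k}_2$ are defined for convenience for use with the CCs, and:

$$
\begin{aligned}
\mathbf{k}_1 &= (k_{BA}, k_{CA}, k'_{BA}, k'_{CA}, k_{AB}, k_{CB}, k_{AB}, k_{CB}, k'_{AB}, k'_{CB}, k_{AC}, k_{BC}, k_{AC}, k_{BC}, k'_{AC}, k'_{BC}) \\
\mathbf{k}_2 &= (k'_{BA}, k'_{CA}, k'_{F_A A}, k'_{F_A A}, k'_{AB}, k'_{CB}, k_{E_B B}, k_{E_B B}, k'_{F_B B}, k'_{F_B B}, k'_{AC}, k'_{BC}, k_{E_C C}, k_{E_C C}, k'_{F_C C}, k'_{F_C C}) \\
\mathbf{k} &= (k_{IA}, k_{BA}, k_{CA}, k'_{BA}, k'_{CA}, k'_{F_A A}, k_{AB}, k_{CB}, k_{E_B B}, k'_{AB}, k'_{CB}, k'_{F_B B}, k_{AC}, k_{BC}, k_{E_C C}, k'_{AC}, k'_{BC}, k'_{F_C C}) \\
\mathbf{K} &= (K_{IA}, K_{BA}, K_{CA}, K'_{BA}, K'_{CA}, K'_{F_A A}, K_{AB}, K_{CB}, K_{E_B B}, K'_{AB}, K'_{CB}, K'_{F_B B}, K_{AC}, K_{BC}, K_{E_C C}, K'_{AC}, K'_{BC}, K'_{F_C C}).
\end{aligned}
$$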

The input term $I k_{IA} \frac{(1-A)}{(1-A) + K_{IA}}$ is added to node A. The differential equations were formed into defect constraints $\boldsymbol{\zeta}(\cdot)$ using the trapezoidal collocation method. The complementarity constraints are managed here using a penalty term [39,40], and the resulting DT formulation is:

$$
\begin{aligned}
\min_{\boldsymbol{\Xi},\mathbf{k},\mathbf{K}} \quad & E = \left( \frac{(O_2 - O_1)/O_1}{(I_2 - I_1)/I_1} \right)^2 + \frac{1}{2} \left( \rho\, \mathbf{k}_1^T \mathbf{k}_2 \right)^2 \\
\text{s.t.} \quad & \left| \frac{(O_{peak} - O_1)/O_1}{(I_2 - I_1)/I_1} \right| \ge 1 \\
& \boldsymbol{\zeta}_i(\boldsymbol{\Xi}, \mathbf{k}, \mathbf{K}) = \mathbf{0} \\
& \mathbf{0} \le \mathbf{k}_1, \mathbf{k}_2, \mathbf{k} \le 10, \quad \mathbf{0} \le \mathbf{K} \le 100
\end{aligned} \tag{10}
$$

where $i = 1, 2, \ldots, n_t - 1, n_t$.
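The penalty term in the objective of (10) can be evaluated as in the sketch below. The function name and the numerical values are illustrative assumptions. Because all entries of $\mathbf{k}_1$ and $\mathbf{k}_2$ are nonnegative, the inner product is zero only when every paired product is zero, so a vanishing penalty indicates that all complementarity pairs are satisfied.

```python
def complementarity_penalty(k1, k2, rho):
    """Penalty term 0.5 * (rho * k1^T k2)^2 from the objective of (10).

    k1, k2 : equally long sequences of nonnegative rate constants, paired
             entry by entry as in the complementarity constraints
    rho    : penalty weight
    """
    if len(k1) != len(k2):
        raise ValueError("k1 and k2 must have the same length")
    inner_product = sum(a * b for a, b in zip(k1, k2))
    return 0.5 * (rho * inner_product) ** 2


if __name__ == "__main__":
    # Complementary pairs: in each pair at least one entry is zero.
    print(complementarity_penalty([1.0, 0.0, 2.0], [0.0, 3.0, 0.0], rho=10.0))
    # Violated pair: both entries of the first pair are nonzero.
    print(complementarity_penalty([1.0, 1.0], [2.0, 0.0], rho=10.0))
```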

5. Numerical Results
The MPCC solution method was applied to both three- and four-node gene circuit design problems. The successful solution of the four-node problem is an important development; previous solution strategies have not been able to optimize gene circuit topology design problems this large.

5.1. Three-Node Gene Circuit Design Problem
Problem (10) was solved in two different ways, both using the optimization algorithm KNITRO [54]. First, a simultaneous direct transcription approach was used where the optimization problem and simulation problem were solved at the same time. The second approach utilized single-shooting, where the state variables $\boldsymbol{\Xi}$ were solved for each candidate design $[\mathbf{k}, \mathbf{K}]$ using a multistep, implicit method for forward simulation∗. This second method is a nested simulation strategy, since simulation is nested within the optimization problem.

A multi-start method was used in each case to improve the probability of locating a global optimum. The simultaneous direct transcription approach exhibited significantly better computational efficiency than the single-shooting method. 20.83% of starting points using the simultaneous approach produced topologies with $P^* \ge 1$†, while only 0.83% did when using the single-shooting method. The best result of the simultaneous approach ($P^* = 1.6072$, $S = 1.016$) had better precision than the best feasible solution obtained using single-shooting ($P^* = 1.4584$, $S = 1.7963$). Observe that the sensitivity constraint in the single-shooting solution is inactive (i.e., $> 1$), indicating an opportunity for improvement that the single-shooting method was unable to identify. The optimal parameters are reported below. Previous studies have identified advantages to traversing state and design spaces simultaneously [25, 35]. A more direct path to the solution can be traced, and often better solutions can be identified. The optimal design found was:

$$
\begin{aligned}
\mathbf{k}^*_{nested} &= (5.35, 0.20, 0.95, 0, 0, 2.51, 1.02, 0.09, 0, 0, 0, 0.66, 1.18, 0, 0, 0, 2.3, 0) \\
\mathbf{K}^*_{nested} &= (2.94, 2.35, 2.21, 1.87, 0.81, 0.07, 0.73, 2.43, 1.86, 4.36, 2.17, 1.16, 0, 1.85, 3.51, 4.13, 0.21, 4.81) \\
\mathbf{k}^*_{DT} &= (2.66, 0.01, 0, 0, 0, 0.57, 0.51, 0.96, 0, 0, 0, 0.43, 0.67, 0, 0, 0, 0.75, 0) \\
\mathbf{K}^*_{DT} &= (1.94, 2.46, 1.56, 0.09, 0.35, 0.02, 0.59, 2.34, 1.26, 0.85, 0.85, 0.13, 0.18, 1.64, 0.70, 2.08, 0.05, 1.69).
\end{aligned}
$$

Figure 6 illustrates the circuit topologies and system responses produced by both solution approaches. Observe that the system is simulated long enough (40 sec.) for the output to achieve steady-state conditions before the input is perturbed. The peak occurs approximately 5 sec. after the input disturbance. Both responses shown here exhibit adaptation, i.e., outputs return to near their original values. The relative height of the output peak after t = 40 sec. corresponds to the sensitivity metric.

5.2. Four-Node Gene Circuit Design Problem
In this section we will extend our investigation to four-node gene circuits, a step towards larger systems. Notice that 36 parameters must be optimized in the three-node case. The number of optimization variables increases significantly for the four-node case. Here we assume that node A receives the input signal and node D produces the circuit output.

In addition to the simultaneous direct transcription solution method, we also investigated the use of a nested optimization strategy. Nested optimization is distinct from the nested simulation, or single-shooting, method described above. The topology design is optimized in the outer optimization loop, and for each topology proposed in the outer loop, an inner-loop problem is solved to obtain the optimal continuous parameter values.

∗ MATLAB® ode15s solver.
† Here the precision reported is scaled as log(E^{-1}).


Figure 6: Resulting topological designs and system responses for both solution approaches. (a) Topology resulting from the single-shooting solution method (nodes A, B, C with input and output). (b) System response for the single-shooting solution: concentrations of A, B, and C versus time (0 to 100). (c) Topology resulting from the direct transcription solution method (nodes A, B, C with input and output). (d) System response for the direct transcription solution: concentrations of A, B, and C versus time (0 to 100).

Here x_{ij} are discrete design variables that represent circuit topology. Each x_{ij} can take one of the following values:

$$
x_{ij} =
\begin{cases}
+1 & \text{a positive regulation link from node } i \text{ to node } j \\
0 & \text{no regulation between nodes } i \text{ and } j \\
-1 & \text{a negative regulation link from node } i \text{ to node } j
\end{cases}
\tag{11}
$$

where (i, j) ∈ {A, B, C, D} × {A, B, C, D}. If i = j, it implies that there is either a background deactivating or activating enzyme in node i. The outer-loop problem formulation is:

$$
\begin{aligned}
\min_{\mathbf{x}} \quad & E^*(\mathbf{x}) \\
\text{s.t.} \quad & -2 \leq x_{AA} + x_{BA} + x_{CA} + x_{DA} \leq 2 \\
& -2 \leq x_{AB} + x_{BB} + x_{CB} + x_{DB} \leq 2 \\
& -2 \leq x_{AC} + x_{BC} + x_{CC} + x_{DC} \leq 2 \\
& -2 \leq x_{DA} + x_{DB} + x_{DC} + x_{DD} \leq 2
\end{aligned}
\tag{12}
$$

where x_{ij} ∈ {−1, 0, 1} and E∗ is the minimum squared error identified by the inner loop for a given topology x. The inner loop problem formulation is:

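$$
\begin{aligned}
\min_{\mathbf{k},\mathbf{K},\Xi} \quad & E = \left( \frac{(O_2 - O_1)/O_1}{(I_2 - I_1)/I_1} \right)^2 \\
\text{s.t.} \quad & \left| \frac{(O_{\text{peak}} - O_1)/O_1}{(I_2 - I_1)/I_1} \right| \geq 1 \\
& \zeta_i(\Xi, \mathbf{k}, \mathbf{K}) = 0 \\
& 0 \leq \mathbf{k} \leq 10, \quad 0 \leq \mathbf{K} \leq 100
\end{aligned}
\tag{13}
$$

where i = 1, 2, ..., n_t − 1, n_t.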
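To illustrate how the outer problem (12) and the inner problem (13) fit together, the following Python sketch checks the four bound constraints of Eq. (12) exactly as printed and keeps the feasible topology with the smallest inner-loop error. It rests on several assumptions: plain enumeration over a short candidate list stands in for branch and bound, and `toy_inner_solve` is a hypothetical placeholder for the direct transcription solve of Eq. (13), so its values carry no biological meaning. The helper names are also illustrative.

```python
NODES = "ABCD"


def make_topology(**links):
    """Build x as a dict of link name -> -1, 0, or +1 (unspecified links are 0)."""
    x = {src + dst: 0 for src in NODES for dst in NODES}
    x.update(links)
    return x


def satisfies_eq12(x):
    """Check the four bound constraints of Eq. (12) as printed in the text."""
    sums = [
        x["AA"] + x["BA"] + x["CA"] + x["DA"],
        x["AB"] + x["BB"] + x["CB"] + x["DB"],
        x["AC"] + x["BC"] + x["CC"] + x["DC"],
        x["DA"] + x["DB"] + x["DC"] + x["DD"],
    ]
    return all(-2 <= s <= 2 for s in sums)


def nested_search(candidates, inner_solve):
    """Outer loop: return the feasible topology with the smallest inner-loop E*."""
    best_x, best_e = None, float("inf")
    for x in candidates:
        if not satisfies_eq12(x):
            continue  # skip topologies that violate Eq. (12)
        e_star = inner_solve(x)  # placeholder for solving Eq. (13)
        if e_star < best_e:
            best_x, best_e = x, e_star
    return best_x, best_e


def toy_inner_solve(x):
    """Hypothetical stand-in objective; NOT the paper's inner problem."""
    return 1.0 / (1 + sum(abs(v) for v in x.values()))


empty = make_topology()
feedforward = make_topology(AB=1, BD=-1, AD=1)
dense = make_topology(AA=1, BA=1, CA=1, DA=1)   # first sum is 4: infeasible

best_x, best_e = nested_search([empty, feedforward, dense], toy_inner_solve)
print(best_x is feedforward, best_e)
```

In an actual implementation each call to the inner solver is a full dynamic optimization, which is why the cost of this strategy grows with the number of candidate topologies visited.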

The inner loop problem is solved using a direct transcription approach. Complementarity constraints are not needed here since topology is managed by the outer loop (a nonlinear integer program). Branch and bound is a general algorithm for solving mixed integer programs. This approach, however, is time consuming because the inner loop must be solved for each candidate topology. Figure 7 illustrates the optimal topology and corresponding system response. The optimal scaled precision in this case is P∗ = 1.2458, and the sensitivity constraint is active (S = 1.0008). The optimal parameters are:

$$
\begin{aligned}
\mathbf{k}^*_{\text{DT}} = (&1.16, 0, 0, 0, 0.36, 0.33, 1.26, 0, 0.28, 0, 0.22, 0, 0, 0.26, 0, 0, \\
&0.07, 0.04, 0.24, 0, 0, 0, 0, 0.24, 0.46, 0.44, 0, 0, 0, 0, 0.57, 0)
\end{aligned}
$$

$$
\begin{aligned}
\mathbf{K}^*_{\text{DT}} = (&0.24, 2.38, 1.79, 2.51, 0.19, 1.4, 0.2, 1.28, 0.18, 1.59, 0.73, 0.60, 1.05, 0.02, 1.15, 0.62, \\
&0.15, 1.43, 0.06, 1.19, 0.86, 0.13, 0.29, 0.16, 0.09, 1.77, 1.75, 1.20, 0.85, 0.32, 0, 0.02).
\end{aligned}
$$

Figure 7: 4-node topology and system response for simultaneous approach. (a) 4-node topology for DT (nodes A, B, C, D with input and output). (b) System response for DT: concentrations of A, B, C, and D versus time (0 to 100).

6. Discussion

In Ma's work [52], all possible three-node topologies (16,038 in total) were enumerated to identify those capable of adaptation. Even if optimization were used in an inner loop to identify optimal parameters, enumerating topologies is more computationally intensive than the nested optimization approach with branch and bound. Instead of optimization, Ma tested 10,000 parameter combinations for each topology, requiring a total of 1.6 × 10^8 distinct gene circuit simulations. Extending this approach to four nodes would be impractical, and we see optimization-based approaches as a means for tackling increasingly complex gene circuit design problems. The enumerative exploration, however, did reveal several insights, such as the ability of negative feedback loops and incoherent feedforward loops to achieve adaptation.
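The scale argument above can be checked with a few lines of arithmetic. The sketch below reproduces the simulation count quoted from [52] and, as a rough upper bound, counts the raw sign patterns for n nodes under the assumption that each of the n × n link variables independently takes a value in {−1, 0, +1}. This raw count ignores the connectivity filtering applied in [52] and the bound constraints used in this paper, so it overestimates the number of admissible topologies.

```python
def enumeration_cost(num_topologies, samples_per_topology):
    """Total simulations when every topology is sampled the same number of times."""
    return num_topologies * samples_per_topology


def raw_sign_patterns(num_nodes):
    """Upper bound: each of the n*n links is independently -1, 0, or +1."""
    return 3 ** (num_nodes * num_nodes)


three_node_sims = enumeration_cost(16_038, 10_000)   # figures quoted in the text
raw_three = raw_sign_patterns(3)
raw_four = raw_sign_patterns(4)

print(f"three-node simulations: {three_node_sims:.2e}")
print(f"raw three-node patterns: {raw_three}")
print(f"raw four-node patterns:  {raw_four}")
print(f"growth factor from three to four nodes: {raw_four // raw_three}")
```

Under this assumption the raw search space grows by a factor of 3^7 = 2187 when moving from three to four nodes, which is consistent with the observation that exhaustive enumeration with parameter sampling does not extend well to four-node circuits.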

The solutions obtained here confirm the importance of these motifs in achieving adaptation. For example, the path A → B −• C → A in Fig. 6(a) is a negative feedback loop, and the paths A → B −• C and A → C in Fig. 6(c) form an incoherent feedforward loop.

7. Conclusion

Dynamic optimization problems are becoming increasingly important across a variety of domains. Here we have extended optimization-based techniques, which are well established in conventional engineering design, to the design of genetic regulatory circuits, with the objective of increasing the complexity of circuits that can be designed successfully. Direct transcription with complementarity constraints was applied successfully to gene circuit design. This method performs simultaneous simulation and optimization, and produced better results than nested simulation and nested optimization with lower computational expense. We see this as a first step toward using optimization-based methods for gene circuit design, and envision it as a means for working toward the design of high-dimension synthetic gene circuits of practical importance.

8. References

[1] T.S. Gardner, C.R. Cantor, and J.J. Collins. Construction of a Genetic Toggle Switch in Escherichia coli. Nature, 403:339–342, 2000.

[2] M.B. Elowitz and S. Leibler. A Synthetic Oscillatory Network of Transcriptional Regulators. Nature, 403:335–338, 2000.

[3] E.M. Judd, M.T. Laub, and H.H. McAdams. Toggles and Oscillators: New Genetic Circuit Designs. BioEssays, 22(6):507–509, June 2000.

[4] P. Purnick and R. Weiss. The Second Wave of Synthetic Biology: From Modules to Systems. Nature Reviews Molecular Cell Biology, 10(6):410–422, June 2009.

[5] J. Hasty, D. McMillen, and J.J. Collins. Engineered Gene Circuits. Nature, 420(6912):224–230, November 2002.

[6] E. Andrianantoandro, S. Basu, D.K. Karig, and R. Weiss. Synthetic Biology: New Engineering Rules for an Emerging Discipline. Molecular Systems Biology, 2:2006.0028, January 2006.

[7] T.K. Lu, A.S. Khalil, and J.J. Collins. Next-generation Synthetic Gene Networks. Nature Biotechnology, 27(12):1139–1150, December 2009.

[8] W. Weber and M. Fussenegger. Emerging Biomedical Applications of Synthetic Biology. Nature Reviews Genetics, 13(1):21–35, January 2012.

[9] T. Knight. Idempotent Vector Design for Standard Assembly of Biobricks Standard Biobrick Sequence Interface. pages 1–11.

[10] R.P. Shetty, D. Endy, and T.F. Knight. Engineering BioBrick Vectors from BioBrick Parts. Journal of Biological Engineering, 2:5, 2008.

[11] D. Densmore, A. Van Devender, M. Johnson, and N. Sritanyaratana. A Platform-Based Design Environment for Synthetic Biological Systems. In The Fifth Richard Tapia Celebration of Diversity in Computing Conference: Intellect, Initiatives, Insight, and Innovations, TAPIA ’09, pages 24–29. ACM, 2009.

[12] M. Pedersen and A. Phillips. Towards Programming Languages for Genetic Engineering of Living Cells. Journal of the Royal Society Interface, 6 Suppl 4(April):437–450, August 2009.

[13] J. Beal and J. Bachrach. Cells Are Plausible Targets for High-Level Spatial Languages. pages 284–291. IEEE, October 2008.

[14] J. Beal, T. Lu, and R. Weiss. Automatic Compilation from High-level Biologically-oriented Programming Language to Genetic Regulatory Networks. PLoS ONE, 6(8):e22490, January 2011.

[15] T. Yi, Y. Huang, M.I. Simon, and J. Doyle. Robust Perfect Adaptation in Bacterial Chemotaxis through Integral Feedback Control. Proceedings of the National Academy of Sciences of the United States of America, 97(9):4649–4653, April 2000.

[16] J. Stelling, U. Sauer, Z. Szallasi, F.J. Doyle, and J. Doyle. Robustness of Cellular Functions. Cell, 118(6):675–685, September 2004.

[17] C.V. Rao, J.R. Kirby, and A.P. Arkin. Design and Diversity in Bacterial Chemotaxis: A Comparative Study in Escherichia coli and Bacillus subtilis. PLoS Biology, 2(2):E49, February 2004.

[18] L. Yang and P.A. Iglesias. Positive Feedback May Cause the Biphasic Response Observed in the Chemoattractant-induced Response of Dictyostelium Cells. Systems and Control Letters, 55(4):329–337, April 2006.

[19] A.L. Slusarczyk, A. Lin, and R. Weiss. Foundations for the Design and Implementation of Synthetic Genetic Circuits. Nature Reviews Genetics, 13(6):406–420, June 2012.

[20] D. Braun, S. Basu, and R. Weiss. Parameter Estimation for Two Synthetic Gene Networks: A Case Study. In the Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, volume 5, 2005.


[21] M.J. Dunlop, J.D. Keasling, and A. Mukhopadhyay. A Model for Improving Microbial Biofuel Production Using a Synthetic Feedback Loop. Systems and Synthetic Biology, 4(2):95–104, June 2010.

[22] G. Batt, B. Yordanov, R. Weiss, and C. Belta. Robustness Analysis and Tuning of Synthetic Gene Networks. Bioinformatics, 23(18):2415–2422, September 2007.

[23] X. Feng, S. Hooshangi, D. Chen, G. Li, R. Weiss, and H. Rabitz. Optimizing Genetic Circuits by Global Sensitivity Analysis. Biophysical Journal, 87(4):2195–2202, October 2004.

[24] L.T. Biegler. An Overview of Simultaneous Strategies for Dynamic Optimization. Chemical Engineering and Processing: Process Intensification, 46(11):1043–1053, November 2007.

[25] J.T. Allison and Z. Han. Co-Design of an Active Suspension Using Simultaneous Dynamic Optimization. In the Proceedings of the 2011 ASME Design Engineering Technical Conferences. ASME, 2011.

[26] C.R. Hargraves and S.W. Paris. Direct Trajectory Optimization Using Nonlinear Programming and Collocation. Journal of Guidance, Control, and Dynamics, 10(4):338–342, 1987.

[27] M.T. Ozimek, D.J. Grebow, and K.C. Howell. Solar Sails and Lunar South Pole Coverage. In the Proceedings of the 2008 AIAA/AAS Astrodynamics Specialist Conference and Exhibit. AIAA, 2008.

[28] J.T. Betts. Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. SIAM, Philadelphia, PA, 2010.

[29] K.E. Brenan, S.L. Campbell, and L.R. Petzold. Numerical Solution of Initial-value Problems in Differential-Algebraic Equations. SIAM, 1996.

[30] L.S. Pontryagin, V.G. Boltyanskii, R.V. Gamkrelidze, and E.F. Mishchenko. The Mathematical Theory of Optimal Processes. Interscience Publishers Inc., New York, NY, 1962.

[31] A.E. Bryson and Y.-C. Ho. Applied Optimal Control. Hemisphere, 1975.

[32] D.P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, Massachusetts, 1999.

[33] U.M. Ascher and L.R. Petzold. Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, Philadelphia, PA, 1998.

[34] A. Flores-Tlacuahuac, L.T. Biegler, and E. Saldívar-Guerra. Dynamic Optimization of HIPS Open-Loop Unstable Polymerization Reactors. Industrial and Engineering Chemistry Research, 44(8):2659–2674, April 2005.

[35] J.T. Allison, M. Kokkolaras, and P.Y. Papalambros. On Selecting Single-Level Formulations for Complex System Design Optimization. Journal of Mechanical Design, 129(9):898–906, 2007.

[36] S. Kameswaran, G. Staus, and L.T. Biegler. Parameter Estimation of Core Flood and Reservoir Models. Computers and Chemical Engineering, (8):1787–1800, 2005.

[37] A. Raghunathan. Ph.D. dissertation, Carnegie Mellon University, 2004.

[38] M. Kocvara and J. Outrata. Effective Reformulations of the Truss Topology Design Problem. Optimization and Engineering, 7(2):201–219, June 2006.

[39] B.T. Baumrucker and L.T. Biegler. MPEC Strategies for Optimization of a Class of Hybrid Dynamic Systems. Journal of Process Control, 19(8):1248–1256, September 2009.

[40] B.T. Baumrucker. MPEC Problem Formulations in Chemical Engineering Applications. pages 1–28, 2007.

[41] A.F. Izmailov, M.V. Solodov, and E.I. Uskov. Global Convergence of Augmented Lagrangian Methods Applied to Optimization Problems with Degenerate Constraints, Including Problems with Complementarity Constraints. 22(4):1579–1606, 2012.

[42] S. Scholtes. Convergence Properties of a Regularization Scheme for Mathematical Programs with Complementarity Constraints. SIAM J. Optim., 11(4):918–936, 2001.

[43] D. Ralph and S.J. Wright. Some Properties of Regularization and Penalization Schemes for MPECs. Optimization Methods and Software, 19(5):527–556, October 2004.

[44] S.S. Shen-Orr, R. Milo, S. Mangan, and U. Alon. Network Motifs in the Transcriptional Regulation Network of Escherichia coli. Nature Genetics, 31(1):64–68, May 2002.

[45] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network Motifs: Simple Building Blocks of Complex Networks. Science, 298(5594):824–827, October 2002.

[46] T.Y. Tsai, Y.S. Choi, W. Ma, J.R. Pomerening, C. Tang, and J.E. Ferrell. Robust, Tunable Biological Oscillations from Interlinked Positive and Negative Feedback Loops. Science, 321(5885):126–129, July 2008.

[47] A. Wagner. Circuit Topology and the Evolution of Robustness in Two-gene Circadian Oscillators. Proceedings of the National Academy of Sciences of the United States of America, 102(33):11775–11780, August 2005.

[48] U. Alon, M.G. Surette, N. Barkai, and S. Leibler. Robustness in Bacterial Chemotaxis. Nature, 397(6715):168–171, January 1999.


[49] N. Barkai and S. Leibler. Robustness in Simple Biochemical Networks to Transfer and Process Information. pages 913–917, 1997.

[50] G. Hornung and N. Barkai. Noise Propagation and Signaling Sensitivity in Biological Networks: A Role for Positive Feedback. PLoS Computational Biology, 4(1):e8, January 2008.

[51] W. Ma, L. Lai, Q. Ouyang, and C. Tang. Robustness and Modular Design of the Drosophila Segment Polarity Network. Molecular Systems Biology, 2:70, January 2006.

[52] W. Ma, A. Trusina, H. El-Samad, W. Lim, and C. Tang. Defining Network Topologies that Can Achieve Biochemical Adaptation. Cell, 138(4):760–773, August 2009.

[53] U. Alon. An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman and Hall/CRC, Taylor and Francis Group, LLC, Boca Raton, FL, 2007.

[54] R.H. Byrd, J. Nocedal, and R.A. Waltz. Knitro: An Integrated Package for Nonlinear Optimization. In Large Scale Nonlinear Optimization, 3559, 2006, pages 35–59. Springer Verlag, 2006.
