
INSTITUTE OF PHYSICS PUBLISHING

PHYSICAL BIOLOGY

Phys. Biol. 2 (2005) 81–88 doi:10.1088/1478-3975/2/2/001

Environmental selection of the

feed-forward loop circuit in

gene-regulation networks

Erez Dekel, Shmoolik Mangan and Uri Alon

Department of Molecular Cell Biology and Department of Physics of Complex Systems,

The Weizmann Institute of Science, Rehovot, 76100, Israel

E-mail: urialon@weizmann.ac.il

Received 23 December 2004

Accepted for publication 31 March 2005

Published 28 April 2005

Online at stacks.iop.org/PhysBio/2/81

Abstract

Gene-regulation networks contain recurring elementary circuits termed network motifs. It is of

interest to understand under which environmental conditions each motif might be selected. To

address this, we study one of the most significant network motifs, a three-gene circuit called

the coherent feed-forward loop (FFL). The FFL has been demonstrated theoretically and

experimentally to perform a basic information-processing function: it shows a delay following

ON steps of an input inducer, but not after OFF steps. Here, we ask under what environmental

conditions might the FFL be selected over simpler gene circuits, based on this function. We

employ a theoretical cost–benefit analysis for the selection of gene circuits in a given

environment. We find conditions that the environment must satisfy in order for the FFL to be

selected over simpler circuits: the FFL is selected in environments where the distribution of the

input pulse duration is sufficiently broad and contains both long and short pulses. Optimal

values of the biochemical parameters of the FFL circuit are determined as a function of the

environment such that the delay in the FFL blocks deleterious short pulses of induction. This

approach can be generally used to study the evolutionary selection of other network motifs.

Introduction

Biological networks contain network motifs: connectivity

patterns that recur in many different systems [1–3]. Network

motifs may be readily detected because they appear much

more often than in randomized networks [1–3]. Transcription

regulation networks show several highly significant network

motifs. Each of the network motifs in transcription networks

has been demonstrated to carry out a basic information-

processing function [4].

One of the most significant network motifs is the feed-forward

loop (FFL), in which a transcription factor X regulates

a second transcription factor Y, and both jointly regulate gene

Z (or several genes Z1,...,Zn) (figure 1(a)) [1]. The FFL

appears in diverse organisms including E. coli [1–3, 5, 6], B.

subtilis [3, 7], yeast [2, 5, 8, 9], C. elegans [6], fruit-fly [3],

sea urchin [3, 10] and humans [11]. For example, sporulation

of B. subtilis is controlled by a transcriptional network made

of several feed-forward loops [7]. Evolution appears to have

independently converged on this motif in different organisms

as well as in different systems within the same organism

[6, 12].

The dynamical behavior of the FFL depends on the

nature of the regulatory interactions (activation or repression)

between X, Y and Z, and on the cis-regulatory input function,

that integrates the effects of X and Y on Z [13–15]. A common

input function is an AND-gate in which both X and Y are

needed to activate Z [5, 6].

The functions of the various possible FFL variants have been analyzed [5, 6].

The most common FFL configuration, called the coherent type-1 FFL [5], has three activation regulations (figure 1(a)). This circuit functions as a sign-sensitive delay element [1, 5, 6]: following a step-like addition of the stimulus of X, Sx, the output gene Z is activated at a delay. The delay is due to the fact that Y must accumulate and cross its activation threshold in order to activate Z. No delay occurs, however, upon a step-like removal of the stimulus Sx. This is because only one input of the AND-gate needs to go off for Z to be deactivated.

1478-3975/05/020081+08$30.00 © 2005 IOP Publishing Ltd. Printed in the UK


Figure 1. Feed-forward loop (FFL) and simple-AND regulation circuits. (a) Feed-forward loop, where X activates Y and both jointly activate gene Z in an AND-gate fashion. The inducers are Sx and Sy. In the ara system, for example, X = CRP, Y = araC, Z = araBAD, Sx = cAMP and Sy = L-arabinose. (b) A simple-AND-gate regulation circuit, where X and Y activate gene Z. In the lac system, for example, Y = lacI is a repressor that is induced by Sy = lactose, X = CRP and Sx = cAMP.

This function can be viewed as a persistence detector: Z is

expressed only in response to sufficiently long pulses of the

input, Sx, whereas rapid deactivation of Z expression occurs

when Sx is removed. These dynamical features have been

experimentally demonstrated in the FFL that regulates the L-

arabinose utilization system of E. coli [6].

Not all systems regulated by two inputs exhibit the FFL:

for example, the lactose system of E. coli [14, 16, 17] is a

simple-AND-gate structure, where X (CRP) does not regulate

Y (LacI) (figure 1(b)). The FFL is found in other E. coli

sugar systems with the same X (CRP), such as the arabinose,

fucose and maltose systems [18–23]. About 40% of the E. coli

operons known to be regulated by two inputs participate in a

FFL [1].

What determines why the FFL is selected in some systems

and not others? It is known that the arrows in regulatory

networks can rapidly change over evolutionary timescales

[9, 12]. For example, it only takes a few point mutations in the

binding site of X in the promoter of Y to abolish the interaction

X → Y. Of the three arrows in the FFL, two are essential for

maintaining the circuit's AND-gate decision-making logic.

These are the arrows X → Z and Y → Z. The third arrow,

X → Y, can be removed without disrupting the AND-gate

logic of the circuit. Therefore, we can ask, what preserves the

regulation of Y by X in the FFL against mutations that would

rapidly abolish this interaction?

To address this, we use a theoretical evolutionary approach

to test the hypothesis that the dynamical properties of the FFL

convey an advantage to the cell under certain environmental

conditions. Evolutionary analysis based on optimality

principles is a classic approach [24]. Examples have been

presented for several design features in biological regulatory

and metabolic systems [12, 25–48]. Pioneering studies include

rules for determining the mode of regulation based on demand

theory [25, 31, 37]; the structure of the pentose–phosphate

pathway as an evolutionary game minimizing the number of

reaction steps [28, 30]; rules for optimal design of metabolic

pathways for maximal efficiency and rapid responses while

minimizing total enzyme production [26, 27, 29, 30, 32,

35, 38, 40, 42, 47]; mathematically controlled comparison

of different designs for genetic switches [12, 31, 34–37,

45]; analysis of optimal genome arrangement in phage [49];

and global optimization of metabolic fluxes [30, 38, 42,

44, 50].

Here, we present a simple model for the selection of the

FFL, based on a cost–benefit analysis of protein action in

a changing environment. We find analytical conditions for

FFL selection in terms of the environmental input distribution.

This may provide an explanation of why the FFL is found in some

systems and not in others. It also provides insight into the

selected values of the biochemical parameters of the FFL in a

given environment.

Results

Cost–benefit analysis of a simple gene-regulation circuit

We analyze a gene-regulation system with two inputs that

control expression of gene Z. We begin with regulation by a

simple-AND circuit (figure 1(b)) and consider the FFL in the

next section. Production of protein Z is ON at a constant rate β

in the presence of both input inducers Sx and Sy, and otherwise

zero.

We consider the effects of production of protein Z on

the growth rate of the cells. The cost of Z production entails a

reduction in growth rate −ηβ, where β is the rate of production

of Z and η is the reduction in growth rate per Z molecule

produced (footnote 1).

On the other hand, the action of the Z gene-product

conveys an advantage to the cells. This advantage is described

by δf(Z), the increase in growth rate due to the action of Z.

f(Z) is typically an increasing function of Z that saturates at

high values of Z.

An example is the arabinose sugar catabolism system of

E. coli. Here, δf(Z) represents the increase in growth rate due

to the energy and carbon supplied to the cells by catabolism of

the sugar Sy = arabinose. The input signal Sx in the arabinose system is cAMP, a signaling molecule produced in the cell upon glucose starvation. In the arabinose system, both Sx = cAMP and Sy = arabinose need to be present for benefit, because of catabolite-exclusion in the absence of Sx, e.g. in the presence of glucose. In this system, f(Z) is the rate at which Z breaks down sugar Sy. This rate can be described by Michaelis–Menten enzyme kinetics: δf(Z) = δ0 v Sy Z/(K + Z), where K is the Michaelis constant of the enzyme, v is the rate at which Sy is metabolized and δ0 is the increase in growth rate per sugar molecule metabolized (footnote 2).

Footnote 1. Typically, the costs for the production of the transcription factors X and Y are negligible compared to the production cost of the effector protein Z [50], since transcription factors are typically produced in far fewer copies per cell than enzymes or structural proteins. If Y costs are not negligible, the advantage of FFL over simple-AND increases, because the FFL prevents unneeded Y production. Y production costs are included in the detailed model in the appendix.

[Figure 2 plot: x-axis, pulse width D/Dc; y-axis, ϕ(D).]

Figure 2. Fitness (integrated growth rate) of simple regulation during short pulses of inputs Sx and Sy. Fitness is negative for D < Dc.

The overall effect of Z on the growth rate is the sum of the cost and benefit [25, 30]:

$$ g = -\eta\beta + \delta f(Z). \tag{1} $$

We now consider a pulse of activation, in which both Sx and Sy are present at saturating levels for a pulse of duration D. The growth of cells with a simple-AND circuit, integrated over time D, is given by

$$ \varphi(D) = \int_0^D g(t)\,dt = -\eta\beta D + \int_0^D \delta f(Z)\,dt. \tag{2} $$

When the pulse begins, protein Z begins to be produced at rate β, and degraded or diluted out by cell growth at rate α [51]. The dynamics of Z concentration are given by

$$ \frac{dZ}{dt} = \beta - \alpha Z \tag{3} $$

resulting in an exponential convergence to steady-state Zm = β/α

$$ Z(t) = Z_m\left(1 - e^{-\alpha t}\right). \tag{4} $$

This solution is in good agreement with high-resolution gene expression measurements [51].

For long pulses (Dα ≫ 1), Z is saturated, Z = Zm, and has a net positive effect on cell growth

$$ \varphi(D) = -\eta\beta D + \delta f(Z_m) D \tag{5} $$

provided that the benefit of Z exceeds its production costs, δf(Zm) > βη.

Short pulses, however, can have a deleterious effect on growth. To see this, consider short pulses such that Dα ≪ 1. In this case Z(t) ∼ βt and, using the series expansion f(Z) ∼ f′Z, the integrated growth rate is (figure 2)

$$ \varphi(D) = \int_0^D \left(-\eta\beta + \delta f'\beta t\right)dt = -\eta\beta D + \frac{\delta f'\beta D^2}{2}. \tag{6} $$

Footnote 2. The Michaelis–Menten term applies to the case where Sy is saturating. More generally, the quadratic form of f(Z), which includes sub-saturating Sy, is described in the appendix.

Growth is reduced (ϕ(D) < 0) for pulses shorter than a critical pulse duration Dc (figure 2)

$$ D_c = \frac{2\eta}{\delta f'}. \tag{7} $$

Hence, short pulses are deleterious. Simple regulation leads to reduction in growth in environments with short pulses, even though Z confers a net advantage for sufficiently long input pulses (figure 3(a)).
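The short-pulse result can be checked numerically. The following is a minimal sketch of equations (4), (6) and (7), not part of the original analysis; the function names and the parameter values are hypothetical and were chosen only so that the pulses stay in the regime Dα ≪ 1. The argument `delta_fprime` stands for the product δf′.

```python
import math

def z_level(t, beta, alpha):
    """Z(t) from equation (4): exponential approach to Zm = beta / alpha."""
    return (beta / alpha) * (1.0 - math.exp(-alpha * t))

def phi_short(D, eta, beta, delta_fprime):
    """Integrated growth for a short pulse, equation (6).

    delta_fprime stands for the product delta * f' (benefit per unit of Z).
    """
    return -eta * beta * D + delta_fprime * beta * D**2 / 2.0

def critical_duration(eta, delta_fprime):
    """Critical pulse duration Dc from equation (7)."""
    return 2.0 * eta / delta_fprime

# Hypothetical parameter values, chosen only for illustration.
eta, beta, alpha, delta_fprime = 0.02, 1.0, 0.01, 0.004
Dc = critical_duration(eta, delta_fprime)

# Fitness changes sign at D = Dc: negative below, positive above.
for D in (0.5 * Dc, Dc, 2.0 * Dc):
    print(f"D/Dc = {D / Dc:.1f}  phi = {phi_short(D, eta, beta, delta_fprime):+.3f}")
```

Because equation (6) relies on Z(t) ∼ βt, the sketch should only be trusted for pulses much shorter than 1/α.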

Cost–benefit analysis of the FFL gene circuit

In the FFL (figure 1(a)), upon a pulse of Sx, the transcription factor Y begins to be produced

$$ \frac{dY}{dt} = \beta_y - \alpha_y Y \tag{8} $$

and exponentially converges to its steady-state level Ym = βy/αy

$$ Y(t) = Y_m\left(1 - e^{-\alpha_y t}\right). \tag{9} $$

Gene Z in the FFL is regulated in an AND-gate fashion by

X and Y. Therefore, to activate Z, Y needs to accumulate to

levels sufficient to bind the Z promoter and cause activation

of transcription. A simple description of regulation of Z by

Y, allowing analytical solution of the dynamics, is threshold

regulation, where Z is produced at rate β when Y > TY, and not

produced when Y < TY. Many genes are indeed regulated with

sharp regulation functions that resemble threshold regulation

[14, 15, 52]. Other, less sharp, regulation functions yield the

same qualitative results, as discussed in the appendix.

Thus, gene Z is only activated at a delay, at time t = τ when Y reaches its activation threshold, Y(τ) = TY. The delay, τ, can be found from equation (9):

$$ \tau = \alpha_y^{-1}\ln\left[\frac{1}{1 - T_Y/Y_m}\right]. \tag{10} $$

This equation relates the magnitude of the delay in Z expression to the biochemical parameters of protein Y. Typical parameter values in bacteria yield delays of the order of 1–100 min. The delay in the FFL can in principle be tuned to optimal values by mutations that change these biochemical parameters. The delay acts to filter out pulses that are shorter than τ (figure 3(b)). This avoids the reduction in growth for short pulses:

$$ \varphi(D) = 0 \quad \text{for} \quad D < \tau. \tag{11} $$

However, the filtering of short pulses has a disadvantage,

because during long pulses, Z is produced at a delay and

misses some of the potential benefit of the pulse (figure 3(b)).

To assess whether the FFL confers a net advantage to the

cells, relative to simple regulation, requires analysis of the

distribution of pulses in the environment.
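As an illustration of how the delay in equation (10) follows from the parameters of Y, here is a minimal sketch. The function names and the numerical values (a 60 min dilution half-time for Y and a threshold at half of Ym) are assumptions made for the example, not values taken from the text.

```python
import math

def ffl_delay(alpha_y, threshold_ratio):
    """Delay tau from equation (10); threshold_ratio is TY / Ym in (0, 1)."""
    if not 0.0 < threshold_ratio < 1.0:
        raise ValueError("TY/Ym must lie strictly between 0 and 1")
    return math.log(1.0 / (1.0 - threshold_ratio)) / alpha_y

def y_level(t, alpha_y, y_max=1.0):
    """Y(t) from equation (9), in units where Ym = y_max."""
    return y_max * (1.0 - math.exp(-alpha_y * t))

# Hypothetical example: Y is diluted with a 60 min half-time and the
# activation threshold sits at half of the steady-state level.
alpha_y = math.log(2) / 60.0
tau = ffl_delay(alpha_y, 0.5)
print(f"tau = {tau:.1f} min, Y(tau)/Ym = {y_level(tau, alpha_y):.2f}")
```

In this sketch the delay grows without bound as TY approaches Ym, which is one way mutations in the threshold could tune τ over a wide range.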

Conditions for FFL selection

The environment of the cell can be characterized by the

probability distribution of the duration of input pulses, P(D).

[Figure 3: panels (a) simple regulation and (b) FFL. Rows, top to bottom: Sx, Z, cost, benefit and growth rate g, each plotted against time, with the delay τ marked.]

Figure 3. Dynamics of gene expression and growth rate in a short, non-beneficial pulse and a long pulse of Sx and Sy. (a) Simple regulation shows a growth deficit for both pulses. (b) FFL filters out the short pulse, but has reduced benefit during the long pulse. The figure shows (top to bottom): (1) Pulse of Sx and Sy. (2) Dynamics of Z expression. Z is turned on after a delay τ (τ = 0 in the case of simple regulation), and approaches its steady-state level Zm. (3) Normalized production cost (reduction in growth rate) due to the production load of Z. Cost begins after the delay τ. (4) Normalized growth rate advantage (benefit) from the action of gene product Z. (5) Net normalized growth rate.

We assume for simplicity that the pulses are far apart, so that the system starts each pulse from zero initial Z levels (and Y levels in the case of the FFL). In this case, the overall fitness can be found by integrating the fitness ϕ(D) over the pulse distribution. For simple-AND circuits,

$$ \Phi_1 = \int_0^\infty P(D)\,\varphi(D)\,dD. \tag{12} $$

For FFL circuits, production starts after a delay τ. Pulses shorter than τ result in no Z production and ϕ(D < τ) = 0. Long pulses begin to be utilized after a delay τ, so that their duration is effectively D − τ (figure 3(b)), resulting in

$$ \Phi_2 = \int_\tau^\infty P(D)\,\varphi(D-\tau)\,dD. \tag{13} $$

Note that the simple regulation is equivalent to an FFL with τ = 0.

The resulting conditions for selection of FFL over simple regulation are

$$ \Phi_2 > \Phi_1, \quad \Phi_2 > 0. \tag{14} $$

Simple regulation is selected when

$$ \Phi_1 > \Phi_2, \quad \Phi_1 > 0. \tag{15} $$

Neither circuit is selected otherwise (Φ1 < 0 and Φ2 < 0) (footnote 3).

Footnote 3. Using the present approach, it is easy to show that a cascade design, X → Y → Z, is never more optimal than an FFL or a simple-AND design. The reason is that the cascade shows delay after X goes off, resulting in unneeded production of Z. The FFL avoids these delays because it shows a delay only after ON steps of Sx and not OFF steps [5, 6]. Indeed, cascades are not network motifs in any known sensory transcription network [2] although they are common in developmental transcription networks [59].

For the purpose of this comparison, the FFL is chosen to have the optimal value for τ (τ which maximizes Φ2). These considerations map the relation between the selection of these gene circuits and the environment (specifically, relations between certain integrals of the pulse distribution).
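Equations (12) and (13) can be evaluated directly once a pulse distribution is given. The sketch below does this for a discrete distribution, supplied as (duration, probability) pairs, and picks the best delay from a list of candidates. It rests on assumptions that are not in the text: f(Z) is taken to be linear, f(Z) = f′Z, for all Z, so that ϕ(D) follows from equations (2) and (4) in closed form; the names `phi`, `fitness` and `best_delay` and all numerical values are hypothetical.

```python
import math

def phi(D, eta, beta, alpha, delta_fprime):
    """Integrated growth over a pulse of length D, from equations (2) and (4).

    Assumes a linear benefit f(Z) = f' * Z; delta_fprime is delta * f'.
    """
    if D <= 0.0:
        return 0.0
    z_integral = (beta / alpha) * (D - (1.0 - math.exp(-alpha * D)) / alpha)
    return -eta * beta * D + delta_fprime * z_integral

def fitness(pulses, tau, **params):
    """Equation (13) for a discrete distribution; tau = 0 gives equation (12)."""
    return sum(p * phi(D - tau, **params) for D, p in pulses if D > tau)

def best_delay(pulses, candidates, **params):
    """Return the candidate delay with the highest fitness."""
    return max(candidates, key=lambda tau: fitness(pulses, tau, **params))

# Hypothetical environment: many short pulses (D = 2) and rare long ones.
params = dict(eta=0.02, beta=1.0, alpha=0.01, delta_fprime=0.004)
pulses = [(2.0, 0.99), (1000.0, 0.01)]
candidates = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0, 10.0]

phi_1 = fitness(pulses, 0.0, **params)      # simple-AND
tau_opt = best_delay(pulses, candidates, **params)
phi_2 = fitness(pulses, tau_opt, **params)  # FFL with the best candidate delay
print(tau_opt, phi_1, phi_2)
```

The search is only as fine as the candidate list; a continuous distribution would need numerical quadrature in place of the sum.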

We now consider two specific environments P(D) where

these conditions can be solved analytically.

The FFL is not selected in the case of exponential pulse

distributions

Environments in which pulses have a constant probability per unit time to end have an exponential pulse distribution

$$ P(D) = D_0^{-1} e^{-D/D_0} \tag{16} $$

where D0 is the mean pulse duration.

Using equations (12) and (13), we find that

$$ \Phi_2 = \int_\tau^\infty D_0^{-1} e^{-D/D_0}\,\varphi(D-\tau)\,dD = e^{-\tau/D_0}\int_0^\infty D_0^{-1} e^{-D/D_0}\,\varphi(D)\,dD = e^{-\tau/D_0}\,\Phi_1 < \Phi_1. \tag{17} $$

Thus, the FFL is never selected: Φ2 < Φ1 when Φ1 > 0, and Φ2 is negative whenever Φ1 is. Simple regulation is selected when Φ1 > 0, which occurs (using equations (12) and (6)) when the mean pulse duration is long enough, D0 > η/(δf′). When the mean pulse duration is too short, D0 < η/(δf′), simple regulation is not selected because of the negative effect of the short pulses in the environment. In this case, gene Z is likely to be lost from the genome on evolutionary timescales.

Hence, the FFL is not better than simple regulation in an exponential pulse environment. In the next section, we analyze an environment where the filtering properties of the FFL can be advantageous.
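The scaling between the two fitness integrals in an exponential environment can be confirmed by brute-force quadrature. The sketch below is an illustrative check only: it uses the short-pulse form of ϕ(D) from equation (6) for all D, a plain trapezoid rule with a finite cutoff, and hypothetical parameter values.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def phi(D, eta=0.02, beta=1.0, delta_fprime=0.004):
    """Equation (6); delta_fprime stands for delta * f' (hypothetical values)."""
    return -eta * beta * D + delta_fprime * beta * D**2 / 2.0

def exp_fitness(tau, D0, cutoff=60.0):
    """Equation (13) with P(D) from equation (16), truncated at tau + cutoff * D0."""
    P = lambda D: math.exp(-D / D0) / D0
    return integrate(lambda D: P(D) * phi(D - tau), tau, tau + cutoff * D0)

D0, tau = 8.0, 3.0
phi_1 = exp_fitness(0.0, D0)   # simple regulation (tau = 0)
phi_2 = exp_fitness(tau, D0)   # FFL with delay tau
print(phi_1, phi_2, math.exp(-tau / D0) * phi_1)
```

With these values D0 exceeds η/(δf′) = 5, so Φ1 is positive and the delayed circuit does strictly worse, as the memoryless property of the exponential distribution suggests.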

[Figure 4 plot: x-axis, pulse width D1/Dc; y-axis, optimal delay τ/Dc.]

Figure 4. Optimal delay τ0 for an FFL circuit in an environment with two types of pulses, short pulses of duration D1 and long pulses of duration D2. When D1 > Dc, the optimal delay is τ0 = 0 and simple-AND regulation may be selected.

The FFL can be selected in bimodal distributions with long

and short pulses

Consider an environment with two kinds of pulses. A pulse can have either a short duration D1 ≪ Dc with probability p, or a long duration D2 ≫ 1/α with probability 1 − p.

The short pulses D1 are non-beneficial, since they are shorter than the critical pulse width at which costs equal benefit, D1 < Dc. In contrast, the long pulses D2 are beneficial,

$$ \varphi(D_2) = -\eta\beta D_2 + \delta f(Z_m) D_2 > 0. \tag{18} $$

In this case, it is easy to calculate the optimal delay in the FFL, τ0 (figure 4): the optimal delay is τ0 = D1. That is, the optimal FFL has a delay which blocks the short pulses precisely; a longer delay would reduce the benefit of the long pulses. The condition for selection of FFL over a simple-AND-gate found by solving equations (12) and (13) is that the probability of short pulses is large enough

$$ p > 1 - \frac{\eta\beta}{\delta f(Z_m)}. \tag{19} $$

The phase diagram for selection is shown in figure 5:

when δf(Zm)/ηβ is small, neither circuit is selected

(production costs outweigh benefits). At large δf(Zm)/ηβ,

the FFL is selected if short pulses are common enough

(equation (19)). If short pulses are rare, simple-AND circuits

are selected. At a given p, the higher the ratio of benefit to

cost, δf(Zm)/ηβ, the more likely the selection of simple-AND

circuits.
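The three regions of the selection diagram can be written as a small decision rule. The sketch below assumes the idealized limits used in the text (D1 ≪ Dc, D2 ≫ 1/α and D2 ≫ D1), so that long pulses are beneficial exactly when δf(Zm)/ηβ > 1 (equation (18)) and the FFL wins when equation (19) holds. The function name and return labels are hypothetical.

```python
def selected_circuit(p, benefit_cost_ratio):
    """Classify a bimodal environment using equations (18) and (19).

    p is the probability of a short pulse; benefit_cost_ratio is
    delta * f(Zm) / (eta * beta). Idealized limits are assumed.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Long pulses must be beneficial, and must actually occur.
    if benefit_cost_ratio <= 1.0 or p == 1.0:
        return "neither"
    # Equation (19): enough short pulses for the delay to pay off.
    if p > 1.0 - 1.0 / benefit_cost_ratio:
        return "FFL"
    return "simple-AND"

for p, ratio in [(0.5, 0.8), (0.9, 4.0), (0.5, 4.0)]:
    print(p, ratio, selected_circuit(p, ratio))
```

At a fixed p, raising the benefit-to-cost ratio moves the boundary 1 − ηβ/δf(Zm) upward, which is why high-benefit systems drift toward the simple-AND region.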

Similar considerations apply in general to P(D) with multiple peaks. Long-tailed pulse distributions, such as P(D) ∼ D^(−γ) with γ > 2, tend to show FFL selection (data not shown). Equations (12) and (13) can be used to test any distribution for its selection properties, and to generate a selection 'phase diagram' similar to figure 5.

The present model is a simplified treatment of the

dynamics of these gene circuits. In the appendix, we present

a more detailed model which takes into account the reactions

between an enzyme and its sugar substrate, as well as graded

input functions. The detailed model gives the same qualitative results as the analytical model discussed above (figure 6, appendix).

[Figure 5 plot: x-axis, benefit/cost δf(Zm)/ηβ; y-axis, p. Regions: neither circuit selected, FFL selected, simple-AND selected.]

Figure 5. Selection diagram for an environment with two types of pulses, a short pulse D1 with probability p, and a long pulse with probability 1 − p. The parameter δf(Zm)/ηβ is the ratio of benefit to production costs of protein Z. Three selection phases are shown, where FFL, simple-AND regulation or neither circuit is selected.

[Figure 6 plot: x-axis, benefit/cost δf(Zm)/ηβ; y-axis, p. Regions: neither circuit selected, FFL selected, simple-AND selected.]

Figure 6. Selection diagram in the environment of figure 5, for the more detailed model presented in the appendix. The detailed model includes costs for Y production, graded activation of Z and f(Z) based on enzyme-ligand binding. Numerical solution of the detailed model equations was used to find the optimal circuit for each value of p and δf(Zm)/ηβ.

Discussion

We presented a simple analysis of selection of gene-regulation

circuits with two inputs. This analysis is based on a cost–

benefit economy in an environment with a given distribution

of inputs. It yields general conditions on the environment

for selection of FFLs over simple regulation circuits. We find that FFLs can be better than simple regulation in long-tailed or multi-modal environments with many short pulses.

The FFL is better when the environmental parameters are such that the cell is exposed to frequent short pulses that cannot be beneficially utilized. The FFL is not selected in environments with exponential pulse distribution. The FFL is only useful in environments where pulse duration can effectively be predicted based on whether it has outlasted a given delay. The optimal delay in the FFL can also be readily calculated for each environment.

The present cost–benefit analysis compares production costs with benefits under a time-varying environment. This cost is used as a criterion for a 'mathematically controlled comparison' [25, 31, 37] between different designs. It can be extended to ask whether the optimal circuit is an evolutionarily stable solution [53]. More generally, it would be important to experimentally test whether optimality considerations are valid for gene circuits.

It is interesting to qualitatively apply the present analysis to the case of sugar systems in E. coli. Why is the FFL selected in some sugar systems, such as the arabinose (ara) system [18–22], whereas simple-AND is selected in others, such as the lactose (lac) system [16]?

Both ara and lac systems share the same X = CRP, a transcription activator stimulated by Sx = cAMP, a signaling molecule produced in the cell upon glucose starvation. Thus, both ara and lac systems have the same Sx pulse distribution. According to our model, selection of circuit type would depend on the ratio of benefit to cost, δf(Zm)/ηβ, in each system. The benefit per lactose molecule (which is split into glucose + galactose) is known to be greater than the benefit per arabinose molecule (approximately 70 ATPs per lactose utilized versus approximately 30 ATPs per arabinose). Thus, the parameter δf(Zm)/ηβ for the ara system may be more to the left in figure 5 relative to the lac system, favoring selection of FFL in the former.

Furthermore, the availability of Sy in the natural environment of E. coli is different in the two systems. The sugar arabinose (Sy in the ara system) is thought to be far more common than lactose (Sy in the lac system) over most of the natural habitat of E. coli within its mammalian host [37]. We do not, however, know the joint probability distribution for pulses of the two signals Sx and Sy in the natural environment. The present theory suggests how differences in the joint pulse distributions of the two sugars might affect FFL selection.

Evolutionary cost–benefit analysis can also explain the selection of the values of the biochemical parameters in a given circuit [12, 25–48], as demonstrated by calculating the optimal FFL delay τ (equations (12)–(16)) as a function of the environment. The value of τ is predicted to be on the time-scale of the deleterious short Sx pulses in the environment. In the ara system of E. coli, τ was experimentally found to be about 0.2 cell generations (about 20 min) [6]. Indeed, Sx (cAMP) is known to have spike-like pulses on a similar time-scale when E. coli cells make transitions between carbon sources [16] or undergo sudden changes in growth rate [54]. Therefore, the FFL in this system may have 'learned' the typical timescale of deleterious input pulses in the environment.

Conclusions and outlook

The present study examined the selection of a network

motif, the feed-forward loop, over simpler regulation circuits,

using cost–benefit analysis. The selection between simple

regulation or FFL was determined as a function of the dynamic

distribution of input signals in the organisms’ environment.

This study makes predictions that are, in principle,

experimentally testable. For example, the theory could be

tested by studying a gene-regulation system in cells evolving

under laboratory environments [55–58] of pulse distributions.

One could then track the evolution of circuit architectures that

according to the theory should be either selected or lost.

We currently have more information about the structure

of some gene circuits than about the precise ecology in which

they evolved. The present approach makes predictions on the

environment based on the observed gene-regulation networks.

It may be considered as a form of 'inverse ecology', suggesting

constraints on the possible environments that could give rise

to observed circuits. It would be interesting to analyze the

environmental selection of the structure and parameters of

other gene circuits.

Acknowledgments

We thank all members of our lab and M Savageau, M Elowitz, R Heinrich and E Klipp for discussions. This study was supported by NIH, Minerva and ISF. ED was supported by a Clore postdoctoral fellowship.

Appendix

We analyze a detailed model based on E. coli sugar

utilization systems. The analysis employs the large separation

of timescales in the problem: sugars bind and activate

transcription factors within milliseconds, transcription factors

bind to promoters within seconds and transcription changes

protein levels on the scale of minutes or more. Rapid reactions

are therefore taken at steady state within the equations for

slower reactions.

Protein production dynamics

X is present at a constant level Xst. Y is regulated by X in the FFL configuration, and is not regulated in the case of simple-AND configuration. Transcription factor X becomes active when it binds Sx. When no Sx is present, X is in its inactive form, X* = 0. When saturating Sx is added, X* = Xst. The active protein X* binds its site in the promoter of Y with dissociation constant Kxy, resulting in a Michaelis–Menten term for the promoter activity of Y. As a result, Y is produced and degraded/diluted according to

$$ \frac{dY}{dt} = \beta_y\frac{X^*}{X^* + K_{xy}} - \alpha Y. \tag{A.1} $$

When Sy is present at saturating levels, we have

$$ Y^* = Y. \tag{A.2} $$

Z is regulated by both X and Y, which bind the Z promoter with dissociation constants Kxz and Kyz, respectively. We assume for simplicity that they bind independently. Therefore, the probability that both X* and Y* bind their sites in the Z promoter is the product of the Michaelis–Menten probabilities of binding,

$$ P = \frac{X^*}{X^* + K_{xz}}\,\frac{Y^*}{Y^* + K_{yz}}. \tag{A.3} $$

The resulting dynamics of Z expression is

$$ \frac{dZ}{dt} = \beta_z\frac{X^*}{X^* + K_{xz}}\,\frac{Y^*}{Y^* + K_{yz}} - \alpha Z. \tag{A.4} $$

In the FFL configuration, Y begins to be produced when X binds to Sx at time t = 0 according to equation (A.1):

$$ Y(t) = \frac{\beta_y}{\alpha}\,\frac{X_{st}}{K_{xy} + X_{st}}\left(1 - e^{-\alpha t}\right) + Y_0 e^{-\alpha t} \tag{A.5} $$

and the analytical solution for Z(t) is (equation (A.4))

$$ Z(t) = \frac{a_1}{1 + a_2}\left[1 - e^{-\alpha t}\left(\frac{\ln\left(e^{\alpha t}(1 + a_2) - a_2\right)}{1 + a_2} + 1\right)\right] + Z_0 e^{-\alpha t}, \tag{A.6} $$

where

$$ a_1 = \frac{\beta_z\beta_y}{K_{yz}\alpha^2}\,\frac{X_{st}}{K_{xz} + X_{st}}\,\frac{X_{st}}{K_{xy} + X_{st}}, \tag{A.7} $$

$$ a_2 = \frac{\beta_y}{K_{yz}\alpha}\,\frac{X_{st}}{K_{xy} + X_{st}}, \tag{A.8} $$

and Y0, Z0 are the values of Y, Z at time t = 0. In a similar manner, one can readily construct the solution for environmental conditions that change between piecewise constant values of Sx and Sy.
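The closed-form Z(t) can be cross-checked against a direct numerical integration of equations (A.1) and (A.4). The sketch below does this for the case Y0 = Z0 = 0 with saturating Sx and Sy (so X* = Xst and Y* = Y). The forward-Euler integrator, the function names and the parameter values are assumptions made for illustration, not the procedure used in the paper.

```python
import math

def a_coeffs(beta_y, beta_z, alpha, Xst, Kxy, Kxz, Kyz):
    """Coefficients a1 and a2 from equations (A.7) and (A.8)."""
    a2 = beta_y / (Kyz * alpha) * Xst / (Kxy + Xst)
    a1 = (beta_z * beta_y / (Kyz * alpha**2)
          * Xst / (Kxz + Xst) * Xst / (Kxy + Xst))
    return a1, a2

def z_analytic(t, a1, a2, alpha, Z0=0.0):
    """Z(t) from equation (A.6)."""
    e = math.exp(alpha * t)
    bracket = 1.0 - (math.log(e * (1.0 + a2) - a2) / (1.0 + a2) + 1.0) / e
    return a1 / (1.0 + a2) * bracket + Z0 / e

def z_numeric(t_end, beta_y, beta_z, alpha, Xst, Kxy, Kxz, Kyz, steps=20000):
    """Forward-Euler integration of (A.1) and (A.4) starting from Y = Z = 0."""
    dt = t_end / steps
    Y = Z = 0.0
    for _ in range(steps):
        dY = beta_y * Xst / (Xst + Kxy) - alpha * Y
        dZ = beta_z * Xst / (Xst + Kxz) * Y / (Y + Kyz) - alpha * Z
        Y, Z = Y + dt * dY, Z + dt * dZ
    return Z

# Hypothetical parameter values in arbitrary units.
pars = dict(beta_y=1.0, beta_z=2.0, alpha=0.5, Xst=1.0, Kxy=1.0, Kxz=1.0, Kyz=1.0)
a1, a2 = a_coeffs(**pars)
z_closed = z_analytic(4.0, a1, a2, pars["alpha"])
z_euler = z_numeric(4.0, **pars)
print(a1, a2, z_closed, z_euler)
```

At long times the bracket in `z_analytic` tends to 1, so Z approaches a1/(1 + a2), the steady state of equation (A.4).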

Cost–benefit analysis

We now describe the effective optimization goal in order to compare the different circuits. The goal is to optimize the mean growth rate integrated over time. The growth rate is

$$ g = -\eta_x\beta_x - \eta_y\beta_y\frac{X_{st}}{X_{st} + K_{xy}} - \eta_z\beta_z\frac{X_{st}}{X_{st} + K_{xz}}\,\frac{Y}{Y + K_{yz}} + \delta[ZS_y], \tag{A.9} $$

where ηxβx, ηyβy and ηzβz are the growth costs for producing X, Y and Z. The last term represents the benefit from Sy metabolism, which is proportional to the action of enzyme Z and its substrate Sy upon binding. Z and Sy form a complex [ZSy] whose concentration at equilibrium satisfies

$$ K_z[ZS_y] = [Z][S_y], \tag{A.10} $$

where Kz is the dissociation constant of enzyme Z to sugar Sy. Two conservation laws for Z and Sy hold

$$ [Z_T] = [Z] + [ZS_y], \tag{A.11} $$

$$ [S_y^T] = [S_y] + [ZS_y], \tag{A.12} $$

where ZT and S_y^T are the total (bound and unbound) concentrations of Z and Sy.

Equations (A.10)–(A.12) can be solved to yield a quadratic form for [ZSy]:

$$ [ZS_y] = \frac{[S_y^T] + [Z_T] + K_z - \sqrt{\left([S_y^T] - [Z_T]\right)^2 + 2K_z\left([S_y^T] + [Z_T]\right) + K_z^2}}{2}. \tag{A.13} $$

Note equation (A.13) for [ZSy] reduces to standard Michaelis–

Menten forms when enzyme concentration is much lower than

the sugar concentration, or vice versa.
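The Michaelis–Menten limit can be illustrated numerically. The sketch below evaluates equation (A.13), reading Kz as a dissociation constant as the text states (so that Kz[ZSy] = [Z][Sy] at equilibrium), and compares it with the familiar form ZT·S/(Kz + S) when the sugar is in large excess. The function name and the concentrations are hypothetical.

```python
import math

def complex_conc(Z_total, S_total, Kz):
    """[ZSy] from equation (A.13), with Kz taken as a dissociation constant."""
    s = S_total + Z_total + Kz
    disc = (S_total - Z_total) ** 2 + 2.0 * Kz * (S_total + Z_total) + Kz**2
    return (s - math.sqrt(disc)) / 2.0

# Hypothetical concentrations (arbitrary units): sugar in large excess.
Z_total, S_total, Kz = 1.0, 1000.0, 10.0
C = complex_conc(Z_total, S_total, Kz)
michaelis_menten = Z_total * S_total / (Kz + S_total)
print(C, michaelis_menten)
```

The same function covers the opposite limit, enzyme in excess, by symmetry of equation (A.13) in ZT and S_y^T.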

The benefit function is the rate of metabolism of Sy times the growth advantage per Sy molecule metabolized, δ0. In the Michaelis–Menten enzyme picture, the rate of metabolism by enzyme Z is v[ZSy], and

$$ \delta f(Z) = \delta_0 v[ZS_y], \tag{A.14} $$

where v is the velocity of enzyme Z.

Optimal designs

We compare the FFL and the simple-AND circuits under

different environmental conditions. For given environmental

conditions (Sx(t) and Sy(t) profiles), we calculated the

dynamics of Y and Z using equations (A.1)–(A.14). Then,

using the fitness function (equation (A.9)), we calculated

the temporally integrated growth rate of cells with FFL or

simple-AND circuits. We optimized the growth rate of cells

by finding the optimal values for the affinity and production

constants βy,βz,Kxy,Kyz,Kxz that give maximal growth.

The optimization was done separately for the FFL and for the

simple-AND configurations by using numerical Nelder–Mead

simplex optimization (Matlab 6.5). The optimal growth rate

of the two circuits was used to calculate the selection diagram

(figure 6).

Glossary

Cost–benefit analysis. Evolutionary analysis that is based on the cell's economy of costs and benefits. Production of proteins costs energy and other resources and therefore reduces the cell's growth rate. The benefit comes from the function of the proteins (for example, the utilization of sugar by enzymes) that increases the growth rate. Cost–benefit analysis can be used to find the optimal protein levels that maximize a fitness function such as growth rate.

Inverse ecology. Finding constraints on the possible ecology of an organism based on the structure of the gene circuits that have evolved in that ecology.

Phase diagram. Diagram that sections a space, whose axes are parameters of the system, into regions in which the system behavior has a particular characteristic. When the region boundaries are crossed, the system characteristic abruptly changes.

Simple regulation, simple-AND-gate regulation.

A configuration where transcription factor X and

transcription factor Y both regulate gene Z, but X does not

regulate Y and vice versa. Both inputs are needed to be active

in order to cause transcription of the gene.


Feed-forward loop (FFL). A gene circuit in which transcription factor X regulates transcription factor Y and both regulate gene Z. In this study, we considered an FFL where both X and Y are needed to activate Z (an AND-gate coherent type-1 FFL according to [5]).

References

[1] Shen-Orr S S, Milo R, Mangan S and Alon U 2002 Nat. Genet.

31 64–8

[2] Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D and

Alon U 2002 Science 298 824–7

[3] Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S,

Ayzenshtat I, Sheffer M and Alon U 2004 Science 303

1538–42

[4] Alon U 2003 Science 301 1866–7

[5] Mangan S and Alon U 2003 Proc. Natl Acad. Sci. USA 100

11980–85

[6] Mangan S, Zaslaver A and Alon U 2003 J. Mol. Biol. 334

197–204

[7] Eichenberger P et al 2004 PLoS Biol. 2 1664–83

[8] Lee T I et al 2002 Science 298 799–804

[9] Teichmann S A and Babu M M 2004 Nat. Genet. 36 492–6

[10] Davidson E H 2002 Science 295 1669–78

[11] Odom D T et al 2004 Science 303 1378–81

[12] Conant G C and Wagner A 2003 Nat. Genet. 34 264–6

[13] Buchler N E, Gerland U and Hwa T 2003 Proc. Natl Acad. Sci.

USA 100 5136–41

[14] Setty Y, Mayo A E, Surette M G and Alon U 2003 Proc. Natl

Acad. Sci. USA 100 7702–7

[15] Bolouri H and Davidson E H 2002 Bioessays 24 1118–29

[16] Kremling A, Bettenbrock K, Laube B, Jahreis K, Lengeler J W

and Gilles E D 2001 Metab. Eng. 3 362–79

[17] Ozbudak E M, Thattai M, Lim H N, Shraiman B I and

Van Oudenaarden A 2004 Nature 427 737–40

[18] Wilcox G, Meuris P, Bass R and Englesberg E 1974 J. Biol.

Chem. 249 2946–52

[19] Casadaban M J 1976 J. Mol. Biol. 104

557–66

[20] Johnson C M and Schleif R F 1995 J. Bacteriol. 177 3438–42

[21] Schleif R 1996 Two positively regulated systems, ara and mal

Escherichia coli and Salmonella vol 1 ed F C Neidhardt

(Washington, DC: ASM Press) pp 1300–9

[22] Schleif R 2000 Trends Genet. 16 559–65

[23] Brown C T and Callan C G Jr 2004 Proc. Natl Acad. Sci. USA

101 2404–09

[24] Rosen R 1967 Optimality Principles in Biology (London:

Butterworths)

[25] Savageau M 1976 Biochemical Systems Analysis: A Study of

Function and Design in Molecular Biology (Reading, MA:

Addison-Wesley)

[26] Heinrich R and Holzhutter H G 1985 Biomed. Biochim. Acta

44 959–69

[27] Arkin A and Ross J 1994 Biophys. J. 67 560–78

[28] Melendez-Hevia E, Waddell T G and Montero F 1994 J. Theor.

Biol. 166 201–20

[29] Heinrich R and Klipp E 1996 J. Theor. Biol. 182 243–52

[30] Heinrich R and Schuster S 1996 The Regulation of Cellular

Systems (London: Chapman and Hall)

[31] Savageau M A 1998 Genetics 149 1677–91

[32] Klipp E and Heinrich R 1999 Biosystems 54 1–14

[33] McAdams H H and Arkin A 2000 Curr. Biol. 10 R318–20

[34] Ouzounis C A and Karp P D 2000 Genome Res. 10 568–76

[35] Gardner T S, Cantor C R and Collins J J 2000 Nature 403

339–43

[36] Becskei A and Serrano L 2000 Nature 405 590–3

[37] Savageau M A 2001 Chaos 11 142–59

[38] Ibarra R U, Edwards J S and Palsson B O 2002 Nature 420

186–9

[39] Hasty J 2002 Proc. Natl Acad. Sci. USA 99 16516–8

[40] Yokobayashi Y, Weiss R and Arnold F H 2002 Proc. Natl

Acad. Sci. USA 99 16587–91

[41] Lenski R E, Ofria C, Pennock R T and Adami C 2003 Nature

423 139–44

[42] Fong S S, Marciniak J Y and Palsson B O 2003 J. Bacteriol.

185 6400–8

[43] Xie G, Keyhani N O, Bonner C A and Jensen R A 2003

Microbiol. Mol. Biol. Rev. 67 303–42

[44] Thattai M and Shraiman B I 2003 Biophys. J. 85 744–54

[45] McAdams H H, Srinivasan B and Arkin A P 2004 Nat. Rev.

Genet. 5 169–78

[46] Wall M E, Hlavacek W S and Savageau M A 2004 Nat. Rev.

Genet. 5 34–42

[47] Zaslaver A, Mayo A E, Rosenberg R, Bashkin P, Sberro H,

Tsalyuk M, Surette M G and Alon U 2004 Nat. Genet. 36

486–91

[48] Reich J G 1983 Biomed. Biochim. Acta 42 839–48

[49] Endy D, You L, Yin J and Molineux I J 2000 Proc. Natl Acad.

Sci. USA 97 5375–80

[50] Segre D, Vitkup D and Church G M 2002 Proc. Natl Acad.

Sci. USA 99 15112–7

[51] Rosenfeld N, Elowitz M B and Alon U 2002 J. Mol. Biol. 323

785–93

[52] Perkins T J, Hallett M and Glass L 2004 J. Theor. Biol. 230

289–99

[53] Maynard Smith J 1974 J. Theor. Biol. 47 209–21

[54] Weber J, Kayser A and Rinas U 2005 Microbiology 151

707–16

[55] Dykhuizen D E and Hartl D L 1983 Microbiol. Rev. 47 150–68

[56] Elena S F and Lenski R E 2003 Nat. Rev. Genet. 4 457–69

[57] Cooper T F, Rozen D E and Lenski R E 2003 Proc. Natl Acad.

Sci. USA 100 1072–7

[58] De Visser J A and Lenski R E 2002 BMC Evol. Biol. 2 19

[59] Rosenfeld N and Alon U 2003 J. Mol. Biol. 329 645–54
