Systems Biology ORIGINAL PAPER
Standard Virtual Biological Parts: A Repository of Modular Model-
ing Components for Synthetic Biology
M. T. Cooling1,*,V. Rouilly3, G. Misirli2, J. Lawson1, T. Yu1, J. Hallinan2 and A. Wipat2
1Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
2School of Computing Science, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
3Department of Bioengineering, Imperial College London, London SW7 2AZ, United Kingdom
Motivation: Fabrication of synthetic biological systems is greatly
enhanced by incorporating engineering design principles and tech-
niques such as computer-aided design. To this end, the ongoing
standardization of biological parts presents an opportunity to de-
velop libraries of standard virtual parts in the form of mathematical
models that can be combined to inform system design.
Results: We present an online Repository, populated with a collec-
tion of standardized models that can readily be recombined to model
different biological systems using the inherent modularity support of
the CellML 1.1 model exchange format. The applicability of this ap-
proach is demonstrated by modeling gold-medal winning iGEM ma-
Availability and Implementation: The Repository is available on-
line as part of http://models.cellml.org. We hope to stimulate the
worldwide community to reuse and extend the models therein, and
contribute to the Repository of Standard Virtual Parts thus founded.
Supplementary Information: Systems Model architecture informa-
tion for the Systems Model described here, along with an additional
example and a tutorial, is also available as Supplementary Informa-
The example Systems Model from this manuscript can be found at
The Template models used in the example can be found at
Since the discovery of recombinant DNA technology, scientists
have manipulated living organisms in order to produce biofuels,
drugs, or other biomaterials. Over the years, a biotechnology in-
dustry has emerged exploiting this technology and delivered a
number of successes [Carlson, 2007]. However, in most cases, the
development of biotechnology applications has been the product of
a manually-driven, trial-and-error-based approach.
*To whom correspondence should be addressed.
In order to achieve efficient and reliable biological system fabrication,
synthetic biology promotes the application of engineering
principles such as abstraction, standardization, and characterization
to biology [Endy, 2005]. These concepts have proven to be crucial
in other engineering disciplines in order to mature from ‘dedicated
craftsmanship’ to successful industrial solutions. Arguably, to date
in synthetic biology, the best example of such an approach is the
Registry of Standard Biological Parts (SBPs) [Peccoud et al.,
2008]. The Registry (http://www.partsregistry.org) provides a collection
of standard DNA parts (BioBricks) [Knight, 2005] that
have been designed to facilitate DNA assembly. Through the
iGEM (International Genetically Engineered Machine;
http://www.igem.org) competition, the use of the Registry has
clearly demonstrated the power of standardization in biology to
stimulate innovation and creativity [Goodman, 2008].
A critical lesson learnt from other engineering disciplines is that
mathematical modeling can dramatically increase the speed of the
design process and reduce the cost of development. A
‘Holy Grail’ in biological modeling would be to design reliable
and robust biological systems in silico prior to fabrication, just as
aeronautic engineers design planes using their computer-aided
design (CAD) tools.
CAD tools are already being developed in order to ease the process
of designing synthetic biological systems [Goler et al., 2008].
However, they currently lack access to modular and reusable ma-
thematical models. Accurate models of SBPs are required for the
prediction of system function, but it is also crucial that mecha-
nisms to easily compose part models into complete systems are
available. Therefore, in parallel to increasing the number of parts
available and characterizing them experimentally, a logical exten-
sion to the Registry would be to build a repository of modular
models of SBPs to complement the physical part Registry [Rouilly
et al., 2007].
Here we describe the development of an online repository of Stan-
dard Virtual Biological Parts (SVPs) – mathematical model com-
ponents describing the function of SBPs which can be downloaded,
extended and recombined to aid the design, in silico, of synthetic
© The Author (2010). Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org
Associate Editor: Prof. Alfonso Valencia
Bioinformatics Advance Access published February 16, 2010
by guest on September 13, 2015
Repositories of models are already available, such as the BioModels
database [LeNovere et al., 2006]. However, the curated models
in this database are monolithic and do not allow further composition
without some modification. Previous work has already explored
the importance of modularity in modeling biological systems.
For example, Rodrigo et al. report the use of a library of
parts encoded in SBML [Rodrigo et al., 2007]. The composition of
models has also been demonstrated using the modeling system
ProMoT [Mirschel et al., 2009] and the Modeling Description
Language (MDL) [Marchisio and Stelling, 2008]. Both studies
make valuable contributions; however, in each case model composition
must be supported directly by the software, rather than
by the model description language.
CellML [Cuellar et al., 2003] is a widely-used model exchange
protocol supported by domain-nonspecific tools, technologies and
initiatives. Importantly, version 1.1 of the CellML specification
includes explicit support for modularity, allowing the construction
of complex models from components without modification [Cool-
ing et al., 2008]. CellML models are ODE-based, so may not be
applicable when modeling very small numbers of molecules or
where intrinsically stochastic processes (such as noise-induced
phenomena) are considered present and important enough to model
explicitly. However, ODE systems can be considered to represent
the average behavior of a large class of even stochastic systems
assuming that the biological reactions take place in ‘well-stirred’
compartments [Schilstra et al., 2008], and are useful for general
synthetic biological system design. Computer science research has
yielded promising alternative formalisms such as BlenX [Dematte
et al., 2008], or more recently P-systems extended for modularity
[Romero-Campero et al., 2009] which also provide composition of
modular models. However, CellML has a proven track record in
representing intracellular processes in systems biology [Hunter &
Borg, 2003], and already has an established framework for multi-
scale modeling [Nickerson et al., 2006], and established tools
[Garny et al., 2008]. These features make CellML an apt choice for
the model representation format.
Since the requirements for standard models may not be known
without varied experience, we advocate a ‘bottom-up’ approach to
the development of a standard via iterative use by, and feedback
from, the community. To begin the process, we present here an
architecture for SVPs and an online repository to support them. We
demonstrate the concepts by developing SVPs for some common
SBP types, and illustrate further by combining these modular
CellML models into models of synthetic biological systems - spe-
cifically, examples from gold-medal winning iGEM projects. We
make all these models publicly accessible online for future reuse
and enhancement by the global synthetic biology community.
2 SYSTEM AND METHODS
We begin by describing the overall architecture for SVPs and how
they can be combined into models of synthetic biological systems.
We then describe the Repository developed to cater for the col-
laborative development of models that fit this architecture.
Following insights on model modularization derived in previous
work in Systems Biology [Cooling et al., 2008], mathematical
models of common SBP types from the Registry - promoters, ribo-
some binding sites (RBSes), RNA and protein coding sequences
(CDSes) - were developed. The models were constructed with
well-defined interfaces such that they are composable without
modification (see Section 3 for more details).
Fig. 1A shows a schematic of a simple genetic circuit designed to
produce a protein ‘A’. From a promoter, RNA containing a single
RBS and a CDS is transcribed. These elements are SBPs as might
be contained in the Registry of Standard Biological Parts.
Fig. 1. A) Schematic of a simple genetic circuit and associated bioenvi-
ronmental reactions. Protein A encoded by a ‘composite device’ forms
complex C on combination with protein B. B) CellML model architecture
for the circuit and bioenvironment shown in Fig. 1A. A Systems Model
representing the complete system of interest aggregates specific (shown
with symbols displayed) SVPs from a library of composable CellML models.
From left to right, these include an E. coli chassis (to give volume information),
a promoter, an RBS, a protein CDS, two degradation reactions
(for the RNA, and for protein A) and the C complex formation reaction.
Both the Systems Model and the SVPs may also aggregate components
representing mathematical templates, including, from left to right across
the bottom of Fig. 1B: Time, Well-stirred Bag (used by E. coli chassis
component), Constitutive Promoter (used by the promoter SVP), RBS (used
by the RBS SVP), Protein CDS (used by the protein CDS SVP for species
A), a unidirectional reaction (used by the degradation reaction SVPs) and
a bidirectional reaction (used by the C complex formation reaction SVP).
The species Template (housing an ODE for keeping track of the concentra-
tion of the species) is also used multiple times, once for each molecular
species of interest in the Systems Model, including the RNA. SVPs may
represent SBPs or bioenvironmental elements, with the former potentially
being linked to a record in the Standard Biological Parts Registry.
While SBPs cover genetic elements, there are potentially many
other intracellular events occurring in a single cell or chassis.
These include the reactions between gene products which are cru-
cial for the genetic circuit to influence the biological system, and
may also include proteins and processes that are abstracted to
lumped-parameter sub-models, such as degradation of gene prod-
ucts or significant reactants. We represent these entities and proc-
esses under the umbrella term of ‘bioenvironment’ models. As
shown in Fig. 1A, protein ‘A’, produced by translation, is a reac-
tant in the bidirectional reaction forming complex ‘C’ on combina-
tion with existing protein ‘B’. This reaction, and the corresponding
degradation reactions, do not relate to SBPs, but are nonetheless
crucial to the functioning of the system. In our formulation, they
are considered part of the bioenvironment, and like SBPs are mod-
eled as SVPs.
Fig. 1B shows how the genetic circuit and associated bioenvironment
would be modeled with SVPs. We use three levels of modelling
abstraction. The top level, denoted ‘Systems Models’, contains
models of entire systems of interest. Systems Models link to mod-
els in the lower levels - the SVPs and Templates - aggregating in-
memory copies of them to build up the desired biological function-
ality. SVPs consist of mathematical formulations that model an
SBP or bioenvironmental function, coupled with associated kinetic
parameters. Their inputs and outputs are so designed that they can
be easily re-used and composed without modification into a ‘Sys-
tems Model’. The lowest level is the ‘Template’ level. Here a
model is given for each distinct mathematical formulation
[Wimalaratne et al., 2009]. For example, one Template is
given for bidirectional mass-action kinetic reactions with two reac-
tants and one product, another is given for constitutive promoters,
a third for promoters with embedded inhibitor functions, and so on.
In addition to the species and processes, time and space are added
to the model through the import of a Time Template, and a set of
Templates relating to cell volumes. In our formulation models are
one-dimensional but compartmentalized into volumes. We provide
a ‘Well-Stirred Bag’ Template to reflect the concept of a three-
dimensional volume, and have, as examples, derived specific vol-
ume components for particular prokaryotic cells.
These Template models make it easy to derive new SVPs, provid-
ing useful general mathematical formulations which only need
parameterizing to become SBP- or bioenvironment-specific. This
modularity and reuse at both Template and SVP levels is made
possible by the modular nature of the CellML language, as will be
discussed further in Section 3.
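As a rough illustration of the Template and SVP levels described above, the following Python sketch treats a Template as an unparameterized function and an SVP as that function with its kinetic parameters bound. It is an analogy only and not CellML; the function names and all parameter values are hypothetical, chosen purely for illustration.

```python
from functools import partial


def unidirectional_reaction(k, s):
    """Template: mass-action flux (nM/s) for a one-reactant reaction."""
    return k * s


def bidirectional_reaction(kf, kr, a, b, c):
    """Template: mass-action flux (nM/s) for A + B <-> C."""
    return kf * a * b - kr * c


# SVPs: the same Templates with hypothetical rate constants bound.
mrna_degradation = partial(unidirectional_reaction, 0.005)       # k in 1/s
protein_a_degradation = partial(unidirectional_reaction, 0.001)  # k in 1/s
complex_c_formation = partial(bidirectional_reaction, 0.01, 0.002)

# A Systems Model would call the SVPs without modifying them.
print(mrna_degradation(100.0))               # flux for 100 nM of mRNA
print(complex_c_formation(10.0, 20.0, 5.0))  # net flux for A=10, B=20, C=5 nM
```

In this analogy, deriving a new SVP from an existing Template only requires binding a different set of parameter values, which mirrors the re-parameterization of Templates described in the text.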
All of the CellML models discussed in this paper are freely acces-
sible online at permanent, unique locations within the CellML
Model Repository (http://models.cellml.org). At the core of the
Repository lies the PMR2 (Physiome Model Repository 2) soft-
ware. PMR2 is built upon a Distributed Version Control System
(DVCS) which stores and versions the models and associated files.
A web interface layer is provided to access the data and to config-
ure user and access controls for any particular model. This web
interface can also be used to generate content pages which describe
the model and also display metadata.
A synthetic biological system can be considered similar to a soft-
ware program. Initially, this program is constructed in silico for
prototyping, and then it is reconstructed in vitro / in vivo to create
the real system. As such, it makes sense for us to make use of in-
frastructure designed for software development for the in silico
stage of this process. PMR2's DVCS treats models and associated
files, such as documentation, simulation data etc. in a similar man-
ner to software projects, providing researchers with an infrastruc-
ture for collaborative model development. Each file within PMR2
is tightly version controlled, and each version of the model is asso-
ciated with a commit message intended to describe what has been
changed since the previous version.
The web interface allows controlled accessibility to the models
stored within. For example, a modeller may set the permissions to
their model such that only their supervisors or external parties have
access to it for review purposes, and then make it publicly accessible
once it has been reviewed. This kind of atomic access control
allows researchers to collaborate on models without necessarily
making them public. Because models are assigned permanent,
unique URLs within the Repository, publications can be furnished
with permanent links to associated model code. Once published in
the CellML Model Repository, models will be made freely avail-
able for redistribution and reuse by anyone, as long as proper attri-
bution is made.
Fig. 2. The workspace architecture of SVP-based models in the Repository.
Separate files are shown by the shaded boxes within workspaces. Informa-
tion flows between workspaces via CellML imports (see Section 3.2) are
represented as arrows. Template models are imported and re-
parameterized to form SVPs with specific properties (shown by the darker
shading of the CellML models), which are then combined via imports to
form a model of a synthetic system. Time is imported directly from the
Template library. The system-level workspace contains a single CellML
file, which collates its constituent modules by referencing information in
other workspaces via unique URLs.
Models are uploaded into workspaces, which contain the model
and associated files listed in a manifest. Each file is given a unique
URL which is version-specific and by which the model compo-
nents can be used by other models via CellML imports (see Sec-
tion 3.2 for more details). Each SVP is contained in its own work-
space, since it represents a defined piece of biological functional-
ity, whose model may be revised by the community - perhaps more
accurately parameterized over time, or given alternate mathemati-
cal formulations - independently from other component models.
In contrast, SVP Templates are envisaged to change less frequently,
and only by the addition of new Templates. They are therefore
considered to be a library of standard mathematical formulations
and are grouped together in a single workspace. Synthetic
systems are developed by modelers in their separate workspaces,
using model components from the SVP and Template workspaces
as needed. This architecture is shown in Fig. 2.
Expert curators are responsible for ensuring the coherency, reliabil-
ity and accuracy of the Repository as a data resource by organiz-
ing, indexing and annotating workspaces and their constituent files.
It should be noted that curation is not intended to take the place of
established peer-review by judging the scientific merit of a model;
rather, curators ensure that a set of minimum metadata has been
added to the model. This minimum set of metadata defines who
created the model and when, and any relevant citation information
should the model be related to a publication. Modelers can also
annotate elements of the CellML model with semantic information
about the biological functionality that they represent. Metadata can
be added to a model at any point in its lifecycle, subject to the
approval of either the model author or Repository curators.
The CellML Model Repository is under ongoing development to
act as a research community hub. The web interface provides infra-
structure for web-based collection and moderation of user-
generated content. Researchers can work on models together,
download, reuse, modify, annotate or combine models, and discuss
their work, gradually building up the available SVP models for
others to reuse. After submission, curators organize and may anno-
tate models to ensure quality [Peccoud et al., 2008] and to assist
modelers in choosing the appropriate component for their work.
The combination of systems modularity within CellML, together
with the ability to collaborate on documented, versioned libraries
of modular model components in the publicly accessible online
Repository, provides a solid platform for rapid in silico prototyping
of synthetic biological systems.
We illustrate the concept of modular modeling by developing
Template (and from them, SVP) models encompassing a range of
useful synthetic biological functions. As a first iteration, we have
chosen to model several core SBP types from the Parts Registry, to
give modelers basic genetic circuit construction functionality:
namely promoters, messenger RNA, RBSes, and CDSes. Termina-
tors did not require a specific Template or SVP in this formulation.
We also develop Templates for some common bioenvironmental
processes such as protein-to-protein and degradation reactions.
First we describe the mathematical formulation of these Templates,
then we discuss how these are encoded into our modeling architec-
ture with CellML. Finally we provide an example of how we have
used our SVPs and Templates to produce a working Systems Mod-
el of a gold-medal-winning iGEM project.
In order to make SVPs composable, it is important to define the
mathematics simply, and with clear interfaces. The formulations
for the Templates (and therefore the SVPs) are designed to make
use of the popular PoPs (polymerases per second) and RiPs (ribosomes
per second) units [Braff et al., 2005], and to express
volumes and concentrations in femtoliters (fL) and nanomolar
(nM) respectively, which are appropriate scales for unicellular
systems. The formulations for the basic Templates developed in
this first iteration will now be described in turn.
Our formulation focuses primarily on proteins and mRNA, and not
on the background transcription/translation machinery of the cell.
In contrast to other systems (such as in [Marchisio and Stelling,
2007]), the concentrations of ribosomes and polymerase are as-
sumed not to be rate limiting, and pools of those potential species
are not modeled explicitly, reducing complexity. However, com-
ponents taking these concentrations into account could be formu-
lated if desired.
The first Template is the promoter, which has the general form:
$$J = c_1(V) \, j \qquad (1)$$
where j is a constant giving the rate of transcription from the pro-
moter, measured in PoPs. c1(V) is a conversion factor scaling the
PoPs to nM of RNA per second (J) produced from the promoter,
and is a function of the volume V (in femtoliters) of the cellular
compartment where transcription takes place (for more details on
the units and conversion factors used in equations (1)-(5), please
see the Supplementary Information). For a constitutive promoter, j
might simply equal some constant k, but j can also be expressed in
more complex ways for different kinds of promoter. For example,
an inducible promoter might have the formulation:
where I is the concentration of an inducer species, with an associ-
ated coefficient Km, and the Hill coefficient n. Similarly, a re-
pressible promoter might have a formulation for j thus:
where R is the concentration (in nM) of some repressing transcrip-
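The inducible and repressible cases can be illustrated with standard Hill-type expressions. This is a sketch under stated assumptions and not necessarily the exact formulation used in the Templates: k, a maximal transcription rate in PoPs, is a symbol introduced here for illustration, while K_m and n are the coefficient and Hill coefficient named in the text.

```latex
% Assumed Hill-type activation for an inducible promoter
\[ j = k \, \frac{I^{n}}{K_m^{n} + I^{n}} \]
% Assumed Hill-type repression for a repressible promoter
\[ j = k \, \frac{K_m^{n}}{K_m^{n} + R^{n}} \]
```

Under these assumed forms, j rises from 0 toward k as I increases and falls from k toward 0 as R increases, with half-maximal output at I = K_m or R = K_m respectively.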
mRNA is handled as an ODE tracking nM of mRNA from a par-
ticular DNA molecule. It can degrade or participate in reactions
like any other molecular species and so is considered part of the
bioenvironment, in contrast to other schemes in which it is han-
dled more implicitly (for example [Marchisio and Stelling, 2008]
and [Rodrigo et al., 2007]).
The RBS converts the concentration of mRNA for the device into a
flux expressed in RiPs:
$$R = k \, [\mathrm{mRNA}] \, c_2(V) \qquad (4)$$
where k is a rate constant for translation, in units of RiPs. This is
multiplied by the concentration of appropriate mRNA (in nM,
available from the corresponding mRNA species concentration as
in Section 3.1.2), and a conversion factor c2(V) which is a function
of the volume V of the cellular compartment (in femtoliters) in
which translation takes place. The rate of translation R is expressed
in units of RiPs in attomoles, that is, how many attomoles of ribosomes
per second are translating the mRNA downstream of the
RBS. Attomoles have been chosen to help keep value ranges such
that numerical precision is likely to be maintained.
The Protein CDS formulation is designed to take the attomoles of
RiPs from an upstream RBS (R) and produce a flux J of protein
produced in nM per second:
using the conversion factor c2(V) from Equation (4). The specifica-
tion of V terms in Promoter, RBS and CDS Templates allows mod-
eling of chassis where transcription takes place in a different com-
partment from translation.
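The promoter, RBS and CDS formulations can be chained as in the following Python sketch. The definitions of the conversion factors are given in the Supplementary Information and are not reproduced here, so the versions below are assumptions: c1(V) is taken to convert molecules per second into nM per second in a compartment of V femtoliters, c2(V) is taken to convert nM into attomoles in that volume, and the CDS is assumed to undo the c2(V) scaling by division. All numerical values are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def c1(volume_fl):
    """Assumed conversion: molecules/s -> nM/s in a volume of volume_fl fL."""
    litres = volume_fl * 1e-15
    return 1e9 / (AVOGADRO * litres)


def c2(volume_fl):
    """Assumed conversion: nM -> attomoles in a volume of volume_fl fL."""
    return volume_fl * 1e-6


def promoter_flux(j_pops, volume_fl):
    """Promoter Template: mRNA production flux J (nM/s) from a PoPs rate j."""
    return c1(volume_fl) * j_pops


def rbs_rate(k, mrna_nm, volume_fl):
    """RBS Template: translation rate R from the mRNA concentration (nM)."""
    return k * mrna_nm * c2(volume_fl)


def cds_flux(r, volume_fl):
    """Assumed CDS Template: protein flux (nM/s), dividing out c2(V)."""
    return r / c2(volume_fl)


volume = 1.0  # fL, hypothetical compartment volume
print(promoter_flux(0.03, volume))
print(cds_flux(rbs_rate(0.1, 50.0, volume), volume))
```

When transcription and translation share one compartment, the volume cancels between the RBS and CDS steps in this sketch. Keeping a separate V argument in each function is what would allow the two processes to be assigned to different compartments.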
Species-to-species interactions are generally modeled according to
mass-action kinetics. For example, a bi-directional reaction with
two reactants and one product would be:
$$J = k_f \, A \, B - k_r \, C \qquad (6)$$
where A, B and C are concentrations of the reactants and product,
respectively, and kf and kr are forward and reverse rate constants,
respectively. Concentrations are measured in nM, and the flux J in
nM per second. A different Template would exist for each combination
of uni- or bi-directional reaction and number of products
and reactants [Wimalaratne et al., 2009]. Templates also allow
reactions to be modeled in different formalisms if appropriate.
For example, an enzymatic reaction might be modeled according to
where k is a rate constant, E is the concentration of the enzyme, R
is the concentration of the reactant, and is associated with the en-
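If the enzymatic reaction above is read as Michaelis-Menten kinetics, the flux could be written as in the following sketch. This reading is an assumption, and K_m is a symbol supplied here for the constant associated with the enzyme; k, E and R are as defined in the text.

```latex
% Assumed Michaelis-Menten rate law for an enzymatic reaction
\[ J = \frac{k \, E \, R}{K_m + R} \]
```

Under this assumed form, the flux saturates at kE when R is much larger than K_m and is approximately (kE / K_m) R when R is much smaller than K_m.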
Unlike other formulations where degradation is part of the model
for a Standard Part, for flexibility in our formulation degradation is
modeled as a bioenvironmental process acting on species such as
proteins or mRNA. Degradation is implemented by a unidirectional
reaction of a species using mass-action kinetics:
$$J = k \, s \qquad (8)$$
where k is a degradation rate constant, and s is the concentration of
3.2 CellML Model Implementation
A Template model was created for each of the equations in the
above formulation, each contained in its own CellML component
[Cuellar et al., 2003] and housed in a separate file.
These Template components are unparameterized, and are encap-
sulated [Cuellar et al., 2003] into more specific SVP models,
which house the parameter values required for the SVP to reflect a
specific SBP or bioenvironmental process. This encapsulation
means that an SVP can be considered as a separate model describ-
ing the behavior of the SBP or bioenvironmental process on which
it is based, to be (re)used and extended independently from others.
To construct a model of a system, the modeler imports ([Cuellar et
al., 2003]) SVP and Template models relating to the genetic and
bioenvironmental processes of interest into a Systems Model. A
single Template or SVP may be imported many times, such as the
species Template which is imported once for each species the
modeler wishes to track in the model. SVP models do not need to
be modified in order to make the necessary connections between
them - instead, CellML connection [Cuellar et al., 2003] elements
are added so that a Systems Model can be thought of as a network
of chained components. Following the precepts in [Cooling et al.,
2008], interface components written at model aggregation time
handle the combination of flux terms contingent on molecular
species, summing them as appropriate to yield an overall total flux
term. The total flux term is connected back into the Species Tem-
plate that was instantiated for a particular species. This architecture
means that any number of flux terms can be made contingent on a
species simply by adding them to the mathematics in the interface
components, which thus act as malleable ‘glue’ for the aggregation
of immutable SVP models.
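The role of the interface components can be sketched in Python for the circuit of Fig. 1. This is a minimal illustration and not the authors' CellML implementation: the RBS and CDS are lumped into a single translation constant, the promoter flux is assumed to be already converted to nM per second, a forward Euler step stands in for a proper ODE solver, and all parameter values are hypothetical.

```python
def simulate(t_end, dt=0.1):
    """Integrate the Fig. 1 circuit; returns final concentrations in nM."""
    # Hypothetical parameter values, for illustration only.
    j_mrna = 0.05        # nM/s, promoter flux after unit conversion
    k_tl = 0.02          # 1/s, lumped RBS + CDS translation constant
    k_deg_mrna = 0.005   # 1/s, mRNA degradation
    k_deg_a = 0.001      # 1/s, protein A degradation
    kf, kr = 0.001, 0.01  # 1/(nM s) and 1/s, complex C formation

    s = {"mRNA": 0.0, "A": 0.0, "B": 100.0, "C": 0.0}
    for _ in range(int(round(t_end / dt))):
        # Fluxes computed by the individual SVPs, which are never modified.
        transcription = j_mrna
        translation = k_tl * s["mRNA"]
        deg_mrna = k_deg_mrna * s["mRNA"]
        deg_a = k_deg_a * s["A"]
        complexation = kf * s["A"] * s["B"] - kr * s["C"]

        # Interface components: sum the fluxes contingent on each species.
        total_flux = {
            "mRNA": transcription - deg_mrna,
            "A": translation - deg_a - complexation,
            "B": -complexation,
            "C": complexation,
        }

        # Species Template: one ODE per species, advanced by one Euler step.
        for name in s:
            s[name] += dt * total_flux[name]
    return s


if __name__ == "__main__":
    print(simulate(2000.0))
```

Making a further process contingent on a species, for example degradation of complex C, would only require one more term in the total_flux entry for that species. The flux definitions themselves stay unchanged, which is the sense in which the interface acts as ‘glue’.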
SVPs can be easily reused between Systems Models, or even as
multiple copies within a Systems Model (for example, in the case
of multiple copies of a CDS downstream from an RBS), simply by
importing them. A tutorial describing the construction of an exam-
ple Systems Model from existing SVPs, using open-source soft-
ware, is provided in the Supplementary Information.
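The import and connection mechanism can be illustrated by generating a skeleton Systems Model with Python's standard XML library. The element and attribute names follow the CellML 1.1 specification, but the URLs, component names and variable names below are hypothetical placeholders and do not refer to actual Repository entries. A real model would also need units, variable declarations and interface attributes, which are omitted here.

```python
import xml.etree.ElementTree as ET

CELLML = "http://www.cellml.org/cellml/1.1#"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("cellml", CELLML)
ET.register_namespace("xlink", XLINK)


def q(tag):
    """Qualify a tag name with the CellML 1.1 namespace."""
    return f"{{{CELLML}}}{tag}"


def add_import(model, href, local_name, remote_name):
    """Import one component from another model file under a local name."""
    imp = ET.SubElement(model, q("import"), {f"{{{XLINK}}}href": href})
    ET.SubElement(imp, q("component"),
                  {"name": local_name, "component_ref": remote_name})
    return imp


def add_connection(model, comp1, comp2, var1, var2):
    """Map a variable of one component onto a variable of another."""
    conn = ET.SubElement(model, q("connection"))
    ET.SubElement(conn, q("map_components"),
                  {"component_1": comp1, "component_2": comp2})
    ET.SubElement(conn, q("map_variables"),
                  {"variable_1": var1, "variable_2": var2})
    return conn


# Hypothetical URLs and names, for illustration only.
model = ET.Element(q("model"), {"name": "example_systems_model"})
add_import(model, "https://example.org/svp/promoter.cellml",
           "promoter", "promoter_svp")
add_import(model, "https://example.org/templates/species.cellml",
           "mRNA", "species")
add_connection(model, "promoter", "mRNA", "J", "J_total")

xml_text = ET.tostring(model, encoding="unicode")
print(xml_text)
```

Importing the same file again under a different local name is how a single Template, such as the species Template, would be instantiated once per species in this sketch.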
Alternative formulations of existing components, or even new components (such as
promoters with more detailed mechanisms or including ribosomal
pools), can easily be implemented in CellML as new Templates,
from which SVPs can then be derived. Care would need to be tak-
en to ensure that new formulations have ‘input’ and ‘output’ vari-
ables that are compatible with each other, in order for derived
SVPs to be connectable with one another. CellML’s strict enforcement
of consistent units reduces errors when connecting
components together, helping to ensure that appropriate connec-
tions are made.
3.3 Example Systems Model
To highlight the composability of SVPs and their applicability to
real biological problems, we demonstrate their use by modeling
Newcastle University’s iGEM 2008 gold-medal winning medical
science project ‘BugBuster’, where SVPs were a foundational
technology of the project. A second example, and a tutorial on
constructing a simplified Systems Model from SVPs, is given in
the Supplementary Information.
3.3.1 System Background