Abstract

From the late 1980s onward, the term "bioinformatics" mostly has been used to refer to computational methods for comparative analysis of genome data. However, the term was originally more widely defined as the study of informatic processes in biotic systems. In this essay, I will trace this early history (from a personal point of view) and I will argue that the original meaning of the term is re-emerging.
Perspective
The Roots of Bioinformatics in Theoretical Biology
Paulien Hogeweg*
Theoretical Biology and Bioinformatics Group, Department of Biology, Faculty of Science, Utrecht University, Utrecht, The Netherlands
Early History: Bioinformatics, a Work Concept
In the beginning of the 1970s, Ben
Hesper and I started to use the term
‘‘bioinformatics’’ for the research we
wanted to do, defining it as ‘‘the study of
informatic processes in biotic systems’’.
(Although several public sources [see
below] trace the origin of the term to
publications by us that appeared in 1978
[1,2], in fact we were using it as early as
1970, proposing the definition above in an
article in Dutch that is not generally
accessible [3].)
It seemed to us that one of the defining
properties of life was information process-
ing in its various forms, e.g., information
accumulation during evolution, informa-
tion transmission from DNA to intra- and
intercellular processes, and the interpreta-
tion of such information at multiple levels.
At a minimum, we felt that information
processing could serve as a useful
metaphor for understanding living sys-
tems. We therefore thought that in
addition to biophysics and biochemistry,
it was useful to distinguish bioinformatics
as a research field (or what we termed a
‘‘work concept’’).
Indeed, at the birth of molecular
biology it was recognized that a central
research theme should be how living
systems gather, process, store, and use
information [4]. This focus on concepts
related to information is, for example,
reflected in the terminology ‘‘genetic
code’’, the central dogma as the unidirec-
tional flow of information, etc. A nice
monograph entitled ‘‘From Deoxyribonu-
cleic Acid to Protein: Transfer of Genetic
Information’’ [5] summarized the state of
the art in molecular biology before the
‘‘sequence age’’, unraveling for me the
essential processes that, at the time in
genetics undergraduate texts, were buried
in ‘‘bead genetics’’. It seems that recently,
after a dormant phase, such information-
centric terminology has become more
prevalent again (e.g., in terms of identify-
ing a distinct research field [4] and
focusing on such processes as sensing the
environment [6] and dynamic phosphor-
ylation and methylation codes [7,8]).
We were embedded then within theo-
retical biology. At the time, after general
systems theory [9,10] had come and gone,
theoretical biology was enjoying a mild resurgence
in acceptance. The series of books
entitled ‘‘Towards a Theoretical Biology’’,
edited by Waddington [11] (reprints of
which are underway), had appeared a few
years earlier. In 1972, the main topic at a
meeting organized by BSRC (Biological
Science Research Council) Developmental
Biology in collaboration with the Society
for Experimental Biology was mathemat-
ical models of development.
Stuart Kauffman was there, presenting
his work on random Boolean networks,
which introduced the concept of large-
scale transcription regulation networks
and viewed a cell type as an attractor in
a multidimensional dynamical system
[12]. It is striking that in the year 2000,
Huang and Ingber reintroduced these
concepts to the experimental molecular
biology community [13] and later beauti-
fully illustrated their power by demon-
strating alternative trajectories to neutro-
phil differentiation on the basis of
temporal gene expression data of 2,773
genes [14].
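The idea of a cell type as an attractor can be made concrete with a toy random Boolean network. The following is a minimal sketch for illustration only, not the implementation used in [12]: the network size, the number of inputs per node, the synchronous update rule, and all function names are assumptions.

```python
import random

def make_network(n, k, seed):
    """Build a random Boolean network: each node gets k random inputs
    and a random truth table over those inputs (assumed construction)."""
    rng = random.Random(seed)
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state of its inputs."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for i in node_inputs:
            index = (index << 1) | state[i]  # pack input bits into a table index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables)
    return trajectory[seen[state]:]

if __name__ == "__main__":
    inputs, tables = make_network(n=8, k=2, seed=1)
    cycle = find_attractor(tuple([0] * 8), inputs, tables)
    print("attractor length:", len(cycle))
```

Because the state space is finite and the update is deterministic, every starting state must end in such a cycle; different starting states may end in different cycles, which is the sense in which distinct attractors can stand for distinct cell types.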
At this same meeting, models and
experiments in such areas as oscillatory
enzyme dynamics (e.g., [15,16]), positional
information [17], and bi-stability in gene
regulation [18] were presented and hotly
discussed. Spatial pattern formation was
one of the central topics, contrasting
Turing systems [19] with gradient-based
systems [17]. Francis Crick, who in that
period published some papers on gradients
in development [20], attended the meet-
ing. Skeptical about the emphasis Turing
Patterns were (still) receiving, Crick quoted
Turing as saying in reaction to enthusiasm
about his work: ‘‘Well, the stripes are easy
but what about the horse part?’’ To go
‘‘for the horse part’’, i.e., to go beyond
pattern formation to multilevel models of
development and morphogenesis, became
one of the long-term goals of our nascent
work concept ‘‘bioinformatics’’.
Also at about that time, John Maynard
Smith gave a lecture in Utrecht and posed
a challenge for evolutionary biology
similar to Turing’s challenge
for developmental biology. While
evolutionary models mainly dealt with
invasion of mutants and changing allele
frequencies, the question of how evolution
leads to complex organisms was not
addressed. As Maynard Smith expressed
it: ‘‘As good evolutionary biologists we
should go once a year to the zoo and visit
the elephant. We should greet it and say
‘Elephant, I believe you got about by
random mutation’’’. To meet the chal-
lenge of a ‘‘constructive evolutionary
biology’’ became another long-term goal
of bioinformatics as we envisioned it.
Research in artificial intelligence at this
time was exploring new representations of
information processing systems, often inspired
by biological systems, e.g., neural
network models for learning and pattern
recognition [21,22], genetic algorithms
[23] for optimization, ‘‘actors’’ [24] for
semi-independent parallel processing, and
‘‘turtle geometry’’ [25,26], demonstrating
the power of an individual self-centered
approach to generating and/or understanding
more global structures.
Citation: Hogeweg P (2011) The Roots of Bioinformatics in Theoretical Biology. PLoS Comput Biol 7(3):
e1002021. doi:10.1371/journal.pcbi.1002021
Editor: David B. Searls, Philadelphia, United States of America
Published March 31, 2011
Copyright: © 2011 Paulien Hogeweg. This is an open-access article distributed under the terms of the
Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any
medium, provided the original author and source are credited.
Funding: The author received no specific funding for this article.
Competing Interests: The author has declared that no competing interests exist.
* E-mail: P.Hogeweg@uu.nl
PLoS Computational Biology | www.ploscompbiol.org 1 March 2011 | Volume 7 | Issue 3 | e1002021
We felt that the re-introduction of
biologically inspired computational ideas
back into biology was needed in order to
begin to understand biological systems as
information processing systems. In partic-
ular, a focus on local interaction leading to
emergent phenomena at multiple scales
seemed to be missing in most biological
models.
At the time, molecular biology was of
course not a heavily ‘‘data-driven’’ science,
as it would become with the advent of
massive sequencing projects. Indeed, data-
driven science was looked down upon,
both in molecular biology and in theoret-
ical biology. However, data-driven re-
search was being done in the more
traditional parts of biology, ecology, and
taxonomy. I had just finished a data
collection survey on water plant vegetation
in India, Czechoslovakia, and The Nether-
lands and had become dissatisfied by the
local state of the art of data processing,
which comprised shuffling large tables by
hand. At the same time, pattern recogni-
tion methods had already been introduced
as ‘‘numerical taxonomy’’ [27], as well as
in ecology [28,29]. Although modeling
and pattern analysis were (and still often
are) seen as separate endeavors, we felt
that for bioinformatic research they were
both needed and should be combined:
first, to analyze patterns of variation at
multiple levels in organisms; second, to
detect emergent phenomena in models;
third, to compare the outcome of such
models with ‘‘real’’ data; and finally, and
most profoundly, because the relationship
between genotype, phenotype, behavior,
and environment itself can be seen as a
type of pattern recognition or pattern
transformation [30,31], and understand-
ing these processes was the core of
bioinformatic research.
In short, under the heading of bioinfor-
matics we wanted to combine pattern
analysis and dynamic modeling and apply
them to the challenge of unraveling
pattern generation and informatic process-
es in biotic systems at multiple scales.
Bioinformatics before the Data Deluge
But what could actually be done given
the scarcity of data and paucity of
computing power?
In fact, many of the basic pattern
analysis methods now used in bioinfor-
matics were pioneered in the 1960s (for a
nice historical overview see [32]) and
further developed in the 1970s. However,
with respect to methods and data it was
still a matter of everyone for themselves, as
no easy exchange was possible. A notable
exception was, of course, the work of
Dayhoff to make protein sequences avail-
able through the yearly printed atlases of
protein sequences and structure (from [33]
to [34]). Accordingly, we spent much time
in developing BIOPAT, an integrated set
of supervised and nonsupervised pattern
analysis methods, though at the same time
we strenuously argued that methods de-
velopment was NOT what bioinformatics
was about.
We used the pattern analysis methods to
study both ‘‘real’’ data and data derived
from modeling studies. Our questions
revolved around relating patterns of var-
iation at different levels of organization.
This included a first foray into non-linear
genotype/phenotype mapping [35], using
the developmental ‘‘grammars’’ intro-
duced by Lindenmayer [36,37], to dem-
onstrate that the pattern of variation at the
level of the genotype (the developmental
rules) and at the level of the phenotype
(the generated ‘‘morphemes’’) does not
necessarily coincide (as implicitly assumed
in phylogenetic studies based on morpho-
logical data). We developed cluster analy-
sis methods with iterative character
weighting [38] to tease apart intermingled
patterns of variation. Thus we could, for
example, untangle morphological varia-
tion due to lineage differences and due to
polyploidy [38]. In hindsight, it is inter-
esting to recall the surprise (and dismay of
the editors) when we found that isozyme
variation was not correlated with lineage
but with climatic conditions [39]. The
general expectation was that, the closer to
the genome, the closer to the ‘‘real’’
evolutionary relationships.
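The developmental grammars mentioned above rewrite every symbol of a string in parallel at each step. The following is a minimal sketch of that mechanism, using the textbook two-symbol example often used to introduce L-systems; it is an assumed stand-in for illustration and not the model analyzed in [35]. The names `rewrite` and `EXAMPLE_RULES` are hypothetical.

```python
def rewrite(axiom, rules, generations):
    """Apply the rewriting rules to all symbols in parallel, repeatedly.
    Symbols without a rule are copied unchanged."""
    word = axiom
    for _ in range(generations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word

# Assumed example rule set (the "genotype" in the sense used above).
EXAMPLE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for g in range(5):
        # Each generated string plays the role of a "phenotype".
        print(g, rewrite("A", EXAMPLE_RULES, g))
```

Changing a single rule and comparing the generated strings gives a feel for why similarity between rule sets need not match similarity between the forms they generate.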
In the 1970s and 1980s, not only were
pattern analysis methods developed, but
novel modeling formalisms also were
actively explored. Nonlinear systems started
to become analyzable through computer
modeling, and new developments, for
instance phase plane analysis, bifurcation
diagrams, and deterministic chaos, were
linked to biological applications (e.g., the
discrete-time logistic growth model is a
prototype for deterministic chaos [40]).
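The behavior of the logistic growth model in discrete time can be explored in a few lines. The following is a minimal sketch of the map x(t+1) = r x(t) (1 - x(t)); the growth rates 2.8, 3.2, and 3.9, the starting value, and the function names are assumptions chosen to show a stable point, a stable cycle, and irregular dynamics.

```python
def logistic_orbit(r, x0, steps):
    """Iterate x -> r * x * (1 - x) and return all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def tail(r, x0=0.2, burn_in=1000, keep=8):
    """Discard transients and return the last `keep` values, rounded."""
    orbit = logistic_orbit(r, x0, burn_in + keep)
    return [round(x, 4) for x in orbit[-keep:]]

if __name__ == "__main__":
    for r in (2.8, 3.2, 3.9):
        print(r, tail(r))
```

For the smallest growth rate the tail settles on a single value, for the middle one it alternates between two values, and for the largest it keeps wandering without repeating, even though the rule itself is fully deterministic.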
Moreover, event-based modeling formalisms
were developed; the best known
is the Gillespie algorithm, developed for
simulating chemical kinetics [41]. Because
our interests lay in information processing
and micro-macro transitions (emergent
phenomena), we focused on the use and
development of modeling formalisms
implementing local interactions. Thus, we
introduced cellular automata as a model-
ing formalism in ecology [42] and evolu-
tion [43], and developed event-based,
individual-oriented (now usually called
agent-based) simulation approaches.
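Event-based simulation in the style of the Gillespie algorithm [41] jumps from one event to the next instead of advancing in fixed time steps. The following is a minimal sketch for a birth-death process with two reactions; the reaction scheme, the rate values, and the function name are assumptions for illustration, not taken from [41].

```python
import random

def gillespie_birth_death(n0, birth, death, t_end, seed=0):
    """Simulate 0 -> X (constant rate `birth`) and X -> 0 (rate `death` * n).
    Returns the event times and the molecule count after each event."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        birth_rate = birth
        death_rate = death * n
        total = birth_rate + death_rate
        if total == 0.0:
            break  # nothing can happen any more
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which event fires, in proportion to its rate.
        if rng.random() * total < birth_rate:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

if __name__ == "__main__":
    times, counts = gillespie_birth_death(n0=5, birth=2.0, death=0.5, t_end=20.0)
    print("events:", len(times) - 1, "final count:", counts[-1])
```

The same two steps, drawing a waiting time and then choosing an event, carry over to individual-oriented models in which the events are actions of individuals rather than chemical reactions.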
Because of the often surprising and
counterintuitive results of such models,
we emphasized a bottom-up modeling
methodology. Instead of designing a
model to explain a priori well-defined
results, in such a bottom-up modeling
methodology known (or assumed) basic
interactions are implemented, and the
resulting dynamics are analyzed in multi-
ple ways and at multiple levels. If and only
if various seemingly unrelated and unfore-
seen consequences of the model corre-
spond to the modeled system, this gives
truly novel insight (and confidence in the
model) [44,45]. To analyze such models,
pattern analysis methods can be indispens-
able to relate the outcome of the models to
‘‘real’’ data. For example, this allowed us
to demonstrate that the behavioral pat-
terns, division of labor, and adaptation to
the environment observed in bumble bee
colonies were emergent properties of local
interaction of simple entities that ‘‘do what
there is to do’’ [46–48].
Data-Driven Bioinformatics
I recall the excitement when, in 1982,
the first European Molecular Biology
Laboratory sequence tape was delivered.
Typing in data (on punch cards) from the
Dayhoff atlases was cumbersome, even
though many aligned sequences were
provided. But what to do with this ‘‘mess’’
of data? Just for fun, we clustered species
on nucleotide and dinucleotide content.
To our surprise (and actually, dismay), a
more or less decent classification emerged!
This, in spite of our mantra that simple
‘‘amounts’’ would not take us very far in
biology and we needed to look at patterns/
information. But now we were back in the
situation of almost a decade before: people
trying to make sense of data by shuffling it
around and finding by ‘‘eye/hand’’ some
optimal arrangement, now with respect to
aligning sets of sequences.
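Clustering on dinucleotide content starts from a simple profile per sequence. The following is a minimal sketch of computing such profiles and a distance between them; the toy sequences are made up for illustration, and the choice of Euclidean distance and all names are assumptions rather than the procedure we used in 1982.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_freqs(seq):
    """Return relative frequencies of overlapping dinucleotides in `seq`.
    Pairs containing symbols other than A, C, G, T are skipped."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]

def distance(a, b):
    """Euclidean distance between two frequency profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    # Hypothetical toy sequences, not real data.
    toy = {"s1": "ATATATATGCAT", "s2": "ATATTTATACAT", "s3": "GCGCGGCCGCGA"}
    names = sorted(toy)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = distance(dinucleotide_freqs(toy[a]), dinucleotide_freqs(toy[b]))
            print(a, b, round(d, 3))
```

A matrix of such pairwise distances is the input that a standard cluster analysis method would then turn into a classification.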
By developing an iterative guide tree-
based multiple alignment method [49], we
opened up this rich resource for our
bioinformatic research. We pursued our
earlier themes of coding structures and
genotype/phenotype mapping through
the study of RNA primary and secondary
structure. It is gratifying that some of the
multiple coding issues we studied are now
being re-examined and that patterns we
gleaned from the sparse data available at
that time are now being verified through
large-scale data analysis and direct high-
throughput experiments. For example, we
found that selection pressure on mRNA is
not only related to protein coding but also
to its secondary structure [50,51], and
inferred that ‘‘synonymous’’ mutations are
therefore not necessarily neutral. Recently
[52], it was inferred that conflicting
selection pressures on synonymous codon
use suggest just such selection pressure on
secondary structure. As another example,
we showed that a common pattern in
mRNA secondary structure was a loosely
folded 5′ end in eukaryotic mRNA [53],
apparently to facilitate translation initia-
tion, a finding that has now been firmly
established [54–56].
Propelled by the exponential increase of
sequence data, the term bioinformatics
became mainstream in the late 1980s,
coming to mean the development and use
of computational methods for data man-
agement and data analysis of sequence
data, protein structure determination,
homology-based function prediction, and
phylogeny. But the rich insights obtained
from the massive sequencing projects, and
the related bioinformatic analyses to unravel
function and evolution, are not really
the ‘‘roots of bioinformatics’’ but rather
the ‘‘trunk of bioinformatics’’, and are not the
subject of this article.
Back to the Future
In 2002, I received a surprising e-mail
from Oxford University Press: ‘‘It appears
that you may be responsible for the term
‘bioinformatics’. I am preparing an entry
for the word in the Oxford English
Dictionary, and in this connection am
investigating its history. . .’’ This led to our
1978 papers on chaotic dynamics in
ecological models [1] and genotype/phenotype
mapping in growth models [2]
being credited as the source of the term
(though, as noted, our usage of it dated
back to 1970). But was our definition of
bioinformatics as the study of informatic
processes in biotic systems at multiple
levels just an historical quirk, to be
superseded by the common meaning of
the term as denoting the development and
use of computational methods for com-
parative analysis of genome data?
The set of fully sequenced genomes
(including human) was expanding, and
high-throughput ‘‘omics’’ data entered the
field, adding new dimensions to data-
driven comparative research. Organisms
were no longer just a ‘‘bag of genes or
proteins’’ but also, e.g., a ‘‘bag of tran-
scriptomes’’, ‘‘a bag of interactomes’’, and
‘‘a bag of metabolomes’’. Integrating these
various data is a marvelous opportunity
and great challenge for bioinformatics in
whatever sense of the word!
Indeed, the insight has again taken hold
that organisms are not just a bag full of
anything, but rather complex dynamical
systems, and that an understanding of their
functioning requires dynamical modeling.
Under the heading ‘‘systems biology’’,
modeling efforts have been revived, and
some of these efforts reflect the problems
and dilemmas encountered already in the
1970s. How far can models be simplified
and still be relevant? (Recall Einstein’s
dictum that ‘‘models should be as simple
as possible but not more so’’.) How can
models be sensibly scaled up so as to meet
the complexity revealed by the genomic
data and still be manageable? As was the
case in the 1970s with respect to ‘‘whole
ecosystem’’ modeling [57], scaling up to the
‘‘whole cell’’ level appears most feasible for
energy flow models [58–61], while large-
scale kinetic models often suffer from the
‘‘parameter curse’’. (The parameter curse
was known in the 1970s as the ‘‘Loch Ness
monster syndrome’’ after the existence of
the creature was ‘‘proven’’ through popu-
lation modeling showing that a large super-
predator was apparently missing.) One way
out of this dilemma might be to use
evolutionary models [62].
Individual-based (agent-based) bottom-
up modeling is still rare, but the detailed
agent-based models of cell division [63]
and locomotion [64] of Odell and co-
workers are promising examples. The
latter paper contains a nice discussion
contrasting such detailed modeling with
much simpler models that might equally fit
the data (even if possibly for the wrong
reasons), stressing that the power of such
detailed models is to reveal novel counter-
intuitive consequences of the modeled
interactions, as well as the surprising
bonus that if detailed local interactions
are modeled, robustness with respect to
parameter choice often ensues.
So what about the long-term goals we
set for bioinformatics in the 1970s, i.e.,
what of the ‘‘horse part’’ and the ‘‘ele-
phant’’? Some progress has been made in
modeling morphogenesis in a strict sense
(the ‘‘horse part’’), through the use of cell-
based models that incorporate some of the
physical properties of cells [65]. In partic-
ular, the simple but biophysically reason-
able representation of a cell in the CPM
modeling formalism [66,67] allows the
scaling up to ‘‘computing an organism’’
[68] (e.g., the life cycle of Dictyostelium
[69,70]). But, as Segel emphasized, ‘‘the
importance of linking changing gene
expression with cell movement means that
this achievement (i.e., computing an orga-
nism) is not the beginning of the end but
rather the end of the beginning’’ [68].
Indeed, there lies the current challenge.
Constructive models of evolution (‘‘the
elephant’’) have progressed from studies on
the evolutionary consequences of non-linear
‘‘physical’’ genotype/phenotype mapping
as exemplified by RNA folding [71–74] to
the evolved genotype/phenotype mapping
in the form of metabolic networks [75,76],
regulatory networks [77–80], and chromo-
some organization [81–83], and in ‘‘virtual
cells’’ [84,85]. These models shed light on
the evolution of robustness and evolvability,
and the interplay between neutrality and
selection. Interestingly, the surprisingly
large gene content of common ancestors
as inferred from phylogenetic analysis of
fully sequenced genomes and the major role
of gene loss in the differentiation of lineages
(cf. [86]) appear to be ‘‘normal’’ features in
constructive models of evolution (T. Cuy-
pers and P. Hogeweg, unpublished data;
[87]). A general conclusion that can be
drawn from these studies is that the multi-
level nature of biological systems makes the
evolutionary process through mutation and
selection ‘‘easier’’ because of self-organiza-
tion at many levels. However, here again
the outstanding challenge is the closer
integration of what does evolve in the
models to what did evolve in nature, as
gleaned from the bioinformatic analysis of
genomic data.
As I am writing this, a video of Nobel
laureate Paul Nurse has been posted in the
science supplement of the Guardian news-
paper [88]. Emphasizing self-organization
and the resulting counterintuitive results,
he argues that the next ‘‘quantum leap’’ in
biology will come through studying infor-
mation processing in biological systems. I
conclude by asserting that, whether bioin-
formatics in the wider sense of studying
information processing in biotic systems is
a quirk or a quantum leap, it is certainly a
mighty interesting quest!
Acknowledgments
Foremost I thank Ben Hesper for conceiving
and developing with me the concept ‘‘bioinfor-
matics’’. I thank Jaap Heringa for his courage in
becoming the first graduate in ‘‘bioinformatics’’
in 1984. I thank Rob de Boer for tackling the
challenging complexity of immune systems as
information processing systems, as well as all
others who helped me develop bioinformatics in
whatever sense of the word.
References
1. Hogeweg P, Hesper B (1978) Interactive instruc-
tion on population interactions. Comput Biol
Med 8: 319–327.
2. Hogeweg P (1978) Simulating the growth of
cellular forms. Simulation 31: 90–96.
3. Hesper B, Hogeweg P (1970) Bioinformatica: een
werkconcept. Kameleon 1(6): 28–29. (In Dutch.)
Leiden: Leidse Biologen Club.
4. Nurse P (2008) Life, logic and information.
Nature 454: 424–426.
5. Szekely M (1980) From deoxyribonucleic acid to
protein: transfer of genetic information. Wiley.
6. Wagner A (2007) From bit to it: How a complex
metabolic network transforms information into
living matter. BMC Sys Biol 1: 33.
7. Thomson M, Gunawardena J (2009) Unlimited
multistability in multisite phosphorylation sys-
tems. Nature 460: 274–277.
8. Turner B (2002) Cellular memory and the histone
code. Cell 111: 285–291.
9. Von Bertalanffy L (1950) An outline of general
system theory. Br J Philos Sci 1: 134–165.
10. Von Bertalanffy L (1973) General system theory.
New York: George Braziller.
11. Waddington CH (1968–1972) Towards a theore-
tical biology. Volumes 1–4. Edinburgh: Edinburgh
University Press.
12. Kauffman S (1969) Metabolic stability and
epigenesis in randomly constructed genetic nets.
J Theor Biol 22: 437–467.
13. Huang S, Ingber D (2000) Shape-dependent
control of cell growth, differentiation, and apop-
tosis: switching between attractors in cell regula-
tory networks. Exp Cell Res 261: 91–103.
14. Huang S, Eichler G, Bar-Yam Y, Ingber D (2005)
Cell fates as high-dimensional attractor states of a
complex gene regulatory network. Phys Rev Lett
94: 128701.
15. Boiteux A, Goldbeter A, Hess B (1975) Control of
oscillating glycolysis of yeast by stochastic,
periodic, and steady source of substrate: a model
and experimental study. Proc Natl Acad Sci U S A
72: 3829–3833.
16. Goodwin B (1963) Temporal organization in
cells: a dynamic theory of cellular control
processes. London: Academic Press.
17. Wolpert L (1969) Positional information and the
spatial pattern of cellular differentiation. J Theor
Biol 25: 1–47.
18. Griffith J (1968) Mathematics of cellular control
processes II. Positive feedback to one gene.
J Theor Biol 20: 209–216.
19. Turing A (1952) The chemical basis of morpho-
genesis. Philos Trans R Soc Lond B Biol Sci 237:
37.
20. Crick F (1970) Diffusion in embryogenesis.
Nature 225: 420–422.
21. Rosenblatt F (1962) Principles of neurodynamics:
perceptrons and the theory of brain mechanisms.
Washington (D.C.): Spartan Books.
22. Minsky M, Papert S (1969) Perceptrons. Cam-
bridge (Massachusetts): MIT Press.
23. Holland J (1975) Adaptation in natural and
artificial system: an introduction with application
to biology, control and artificial intelligence. Ann
Arbor (Michigan): University of Michigan Press.
24. Hewitt C (1977) Viewing control structures as
patterns of passing messages. Artificial Intelli-
gence 8: 323–364.
25. Abelson H, DiSessa A (1986) Turtle geometry:
the computer as a medium for exploring math-
ematics. Cambridge (Massachusetts): MIT Press.
26. Papert S (1993) Mindstorms: children, computers,
and powerful ideas. New York: Basic Books.
27. Sneath P, Sokal R (1972) Numerical taxonomy:
the principles and practice of numerical classifi-
cation. San Francisco: Freeman. xvi, 573 p.
28. Lance G, Williams W (1966) A generalized
sorting strategy for computer classifications.
Nature 212: 218.
29. Macnaughton-Smith P, Williams W, Dale M,
Mockett L (1964) Dissimilarity analysis: a new
technique of hierarchical sub-division. Nature
202: 1034–1035.
30. Hogeweg P (1976) Topics in biological pattern
analysis [PhD thesis] Faculty of Science, Univer-
sity of Utrecht.
31. Rosen R (1983) Dynamical modelling of genetic
and epigenetic control. In: Bellmann K, ed.
Modelling and simulation of molecular genetic
information systems. Berlin: Akademie Verlag. pp
17–30.
32. Hagen J (2000) The origins of bioinformatics. Nat
Rev Genet 1: 231–236.
33. Dayhoff M, Eck R (1968) Atlas of protein
sequence and structure 1967–1968. Maryland
(Silver Spring): National Biomedical Research
Foundation.
34. Dayhoff M (1978) Atlas of protein sequence and
structure. Volume 5. Washington (D.C.): Nation-
al Biomedical Research Foundation.
35. Hogeweg P, Hesper B (1974) A model study on
biomorphological description. Pattern Recognit
6: 165–179.
36. Lindenmayer A (1968) Mathematical models for
cellular interactions in development I. Filaments
with one-sided inputs. J Theor Biol 18: 280–299.
37. Lindenmayer A (1968) Mathematical models for
cellular interactions in development II. Simple
and branching filaments with two-sided inputs.
J Theor Biol 18: 300–315.
38. Hogeweg P (1976) Iterative character weighing in
numerical taxonomy. Comput Biol Med 6:
199–211.
39. Mastenbroek O, Hogeweg P, Heringa J,
Niemann G, van Nigtevecht G, et al. (1984)
Isozyme variation in Silene pratensis: a response
to different environments. Biochem Syst Ecol 12:
29–36.
40. May R (1974) Biological populations with non-
overlapping generations: stable points, stable
cycles, and chaos. Science 186: 645–647.
41. Gillespie D (1977) Exact stochastic simulation of
coupled chemical reactions. J Phys Chem 81:
2340–2361.
42. Hogeweg P (1988) Cellular automata as a
paradigm for ecological modeling. Appl Math
Comput 27: 81–100.
43. Boerlijst M, Hogeweg P (1991) Spiral wave
structure in pre-biotic evolution: hypercycles
stable against parasites. Physica D: Nonlinear
Phenomena 48: 17–28.
44. Hogeweg P, Hesper B (1986) Knowledge seeking
in variable structure models. In: Elzas MS,
Oren TI, Zeigler BP, eds. Simulation in the
artificial intelligence era. Amsterdam: North
Holland. pp 227–243.
45. Hogeweg P, Hesper B (1989) An adaptive,
selfmodifying, non goal directed modelling meth-
odology. In: Elzas MS, Oren TI, Zeigler BP, eds.
Knowledge systems paradigms. Amsterdam:
North Holland. pp 77–92.
46. Honk C, Hogeweg P (1981) The ontogeny of the
social structure in a captive Bombus terrestris
colony. Behav Ecol Sociobiol 9: 111–119.
47. Hogeweg P, Hesper B (1983) The ontogeny of the
interaction structure in bumble bee colonies: a
MIRROR model. Behav Ecol Sociobiol 12:
271–283.
48. Hogeweg P, Hesper B (1985) Socioinformatic
processes: MIRROR modelling methodology.
J Theor Biol 113: 311–330.
49. Hogeweg P, Hesper B (1984) The alignment of
sets of sequences and the construction of phyletic
trees: an integrated method. J Mol Evol 20:
175–186.
50. Konings D, Hogeweg P, Hesper B (1987)
Evolution of the primary and secondary struc-
tures of the E1a mRNAs of the adenovirus. Mol
Biol Evol 4: 300–314.
51. Huynen M, Konings D, Hogeweg P (1992) Equal
G and C contents in histone genes indicate
selection pressures on mRNA secondary struc-
ture. J Mol Evol 34: 280–291.
52. Stoletzki N (2008) Conflicting selection pressures
on synonymous codon use in yeast suggest
selection on mRNA secondary structures. BMC
Evol Biol 8: 224.
53. Konings D, Van Duijn L, Voorma H, Hogeweg P
(1987) Minimal energy foldings of eukaryotic
mRNAs form a separate leader domain. J Theor
Biol 127: 63–78.
54. Kozak M (2005) Regulation of translation via
mRNA structure in prokaryotes and eukaryotes.
Gene 361: 13–37.
55. Gu W, Zhou T, Wilke C (2010) A universal trend
of reduced mRNA stability near the translation-
initiation site in prokaryotes and eukaryotes.
PLoS Comput Biol 6: e1000664. doi:10.1371/
journal.pcbi.1000664.
56. Kertesz M, Wan Y, Mazor E, Rinn J, Nutter R,
et al. (2010) Genome-wide measurement of RNA
secondary structure in yeast. Nature 467:
103–107.
57. Odum EP (1968) Energy flow in ecosystems: a
historical review. Integr Comp Biol 8: 11–18.
58. Varma A, Palsson B (1994) Metabolic flux
balancing: basic concepts, scientific and practical
use. Nat Biotechnol 12: 994–998.
59. Covert M, Knight E, Reed J, Herrgard M,
Palsson B (2004) Integrating high-throughput and
computational data elucidates bacterial networks.
Nature 429: 92–96.
60. Pál C, Papp B, Lercher M, Csermely P, Oliver S,
et al. (2006) Chance and necessity in the evolution
of minimal metabolic networks. Nature 440:
667–670.
61. Freilich S, Kreimer A, Borenstein E, Gophna U,
Sharan R, et al. (2010) Decoupling environment-
dependent and independent genetic robustness
across bacterial species. PLoS Comput Biol 6:
e1000690. doi:10.1371/journal.pcbi.1000690.
62. Van Hoek M, Hogeweg P (2006) In silico evolved
lac operons exhibit bistability for artificial induc-
ers, but not for lactose. Biophys J 91: 2833–2843.
63. Odell G, Foe V (2008) An agent-based model
contrasts opposite effects of dynamic and stable
microtubules on cleavage furrow positioning.
J Cell Biol 183: 471–483.
64. Rafelski S, Alberts J, Odell G, Goodson H (2009)
An experimental and computational study of the
effect of ActA polarity on the speed of Listeria
monocytogenes actin-based motility. PLoS Com-
put Biol 5: e1000434. doi:10.1371/journal.
pcbi.1000434.
65. Anderson A, Chaplain M, Rejniak K, Fozard J
(2008) Single-cell-based models in biology and
medicine. Basel: Birkhauser Verlag.
66. Graner F, Glazier J (1992) Simulation of
biological cell sorting using a two-dimensional
extended Potts model. Phys Rev Lett 69:
2013–2016.
67. Marée A, Grieneisen V, Hogeweg P (2007) The
Cellular Potts Model and biophysical properties
of cells, tissues and morphogenesis. In:
Anderson A, Rejniak K, eds. Single-cell-based
models in biology and medicine. Basel: Birkhau-
ser Verlag. pp 107–136.
68. Segel L (2001) Computing an organism. Proc
Natl Acad Sci U S A 98: 3639–3640.
69. Savill N, Hogeweg P (1997) Modelling morpho-
genesis: from single cells to crawling slugs. J Theor
Biol 184: 229–235.
70. Marée A, Hogeweg P (2001) How amoeboids self-
organize into a fruiting body: multicellular
coordination in Dictyostelium discoideum. Proc
Natl Acad Sci U S A 98: 3879–3883.
71. Schuster P, Fontana W, Stadler P, Hofacker I
(1994) From sequences to shapes and back: a case
study in RNA secondary structures. Proc Biol Sci
255: 279–284.
72. Huynen M, Stadler P, Fontana W (1996)
Smoothness within ruggedness: the role of
neutrality in adaptation. Proc Natl Acad
Sci U S A 93: 397–401.
73. van Nimwegen E, Crutchfield J, Huynen M
(1999) Neutral evolution of mutational robustness.
Proc Natl Acad Sci U S A 96: 9716–9720.
74. Huynen M (1996) Exploring phenotype space
through neutral evolution. J Mol Evol 43:
165–169.
75. Kacser H, Beeby R (1984) Evolution of catalytic
proteins or on the origin of enzyme species by
means of natural selection. J Mol Evol 20: 38–51.
76. Soyer O, Pfeiffer T (2010) Evolution under
fluctuating environments explains observed ro-
bustness in metabolic networks. PLoS Comput
Biol 6: e1000907. doi:10.1371/journal.
pcbi.1000907.
77. Crombach A, Hogeweg P (2008) Evolution of
evolvability in gene regulatory networks. PLoS
Comput Biol 4: e1000112. doi:10.1371/journal.
pcbi.1000112.
78. Draghi J, Wagner G (2009) The evolutionary
dynamics of evolvability in a gene network model.
J Evol Biol 22: 599–611.
79. Wagner A (2008) Robustness and evolvability: a
paradox resolved. Proc Biol Sci 275: 91–100.
80. Draghi J, Parsons T, Wagner G, Plotkin J (2010)
Mutational robustness can facilitate adaptation.
Nature 463: 353–355.
81. Crombach A, Hogeweg P (2007) Chromosome
rearrangements and the evolution of genome
structuring and adaptability. Mol Biol Evol 24:
1130–1139.
82. Hurst L, Pál C, Lercher M (2004) The evolutionary
dynamics of eukaryotic gene order. Nat
Rev Genet 5: 299–310.
83. Batada N, Hurst L (2007) Evolution of chromo-
some organization driven by selection for reduced
gene expression noise. Nat Genet 39: 945–949.
84. Neyfakh A, Baranova N, Mizrokhi L (2006) A
system for studying evolution of life-like virtual
organisms. Biol Direct 1: 23.
85. Goldstein R, Soyer O (2008) Evolution of taxis
responses in virtual bacteria: non-adaptive dy-
namics. PLoS Comput Biol 4: e1000084.
doi:10.1371/journal.pcbi.1000084.
86. Koonin E (2007) The Biological Big Bang model
for the major transitions in evolution. Biol Direct
2: 21.
87. de Boer FK, Hogeweg P (2010) Eco-evolutionary
dynamics, coding structure and the information
threshold. BMC Evol Biol 10: 361.
88. The Guardian (12 November 2010) Sir Paul
Nurse: organisms are information networks
http://www.guardian.co.uk/science/video/2010/nov/05/paul-nurse-life-information-networks
[video]. Accessed 28 February 2011.