© 2005 Nature Publishing Group
Departament de Genètica,
Facultat de Biologia,
Universitat de Barcelona,
Avinguda Diagonal 645,
08028 Barcelona, España.
e-mail: jordigarcia@ub.edu
doi:10.1038/nrg1723
Published online
10 November 2005
BILATERIANS
Members of the animal
kingdom that have bilateral
symmetry — the property of
having two similar sides, with
definite upper and lower
surfaces, and anterior and
posterior ends. They include
acoelomorphs, protostomes and
deuterostomes.
CNIDARIANS
Radially symmetrical animals
that have sac-like bodies with
only one opening. They include
jellyfish, corals, hydra and
anemones.
METAZOAN
A multicellular animal.
The principal body axes of animals are patterned early in embryonic development. Part of this process involves homeobox genes, which encode a group of transcription factors that contain a 60-amino-acid domain (the homeodomain). One particular subgroup of homeobox genes, the Hox genes, consists of genes that are physically linked on a chromosome in so-called ‘Hox clusters’. What makes Hox genes special is not only their organization in chromosomal clusters (8 homeotic genes in the split Antennapedia–Ultrabithorax (ANTP–UBX) complex of Drosophila melanogaster and 39 Hox genes in 4 complexes in mammals) but also the striking phenomenon of spatial collinearity. The gene order in the cluster mirrors the order of expression of the genes and their function along the anterior–posterior (A–P) body axis: genes at the 5′ end of the cluster are expressed in, and pattern, the posterior part of the body, whereas genes at the 3′ end pattern the anterior end of the body (REF. 1). In some lineages (mainly vertebrates), genes within Hox clusters show temporal collinearity in addition to spatial collinearity: anterior genes are expressed earlier than posterior genes (REFS 2,3). Hox clusters are found in nearly all BILATERIAN animals and probably arose before the CNIDARIAN–bilaterian split (BOX 1). The persistence of an intact, collinear Hox cluster for over 1,000 million years (Myr) indicates that its organization is evolutionarily constrained (REF. 4).
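Spatial collinearity, as described above, can be expressed as a simple ordering check. The following is a minimal sketch, not a method from this article: the function name, the toy gene labels and the numeric expression boundaries are all invented for illustration. It assumes that each gene is summarized by the anterior limit of its expression domain on a 0 (anterior) to 1 (posterior) scale.

```python
from typing import Mapping, Sequence


def is_spatially_collinear(
    genes_3_to_5: Sequence[str],
    anterior_limit: Mapping[str, float],
) -> bool:
    """Return True if gene order (3' to 5') matches anterior-to-posterior order.

    anterior_limit maps each gene to the anterior boundary of its expression
    domain on a 0.0 (anterior tip) to 1.0 (posterior tip) scale.
    """
    limits = [anterior_limit[gene] for gene in genes_3_to_5]
    # Collinear if boundaries never move anteriorly as we walk 3' -> 5'.
    return all(a <= b for a, b in zip(limits, limits[1:]))


# Hypothetical toy cluster: one gene per Hox class, listed 3' to 5'.
TOY_CLUSTER = ["Anterior", "Group3", "Central", "Posterior"]
# Invented expression boundaries, for illustration only (not measured data).
TOY_LIMITS = {"Anterior": 0.1, "Group3": 0.3, "Central": 0.5, "Posterior": 0.8}
```

Calling `is_spatially_collinear(TOY_CLUSTER, TOY_LIMITS)` on the toy data returns `True`; reversing the gene order makes the check fail, which is the sense in which cluster order and body-axis order are said to correspond.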
In recent years, two close relatives of the Hox cluster have been found. The first, the ParaHox cluster, is an evolutionarily close relative, or paralogue, of the Hox cluster (REF. 5). Both Hox and ParaHox clusters arose by duplication of an ancestral ProtoHox cluster early in METAZOAN evolution. The second, more distantly related family of homeobox genes is called the NK cluster. NK homeobox genes are evolutionary relatives of both Hox and ParaHox genes, and are found in clusters in some lineages. Comparative genomics studies strongly indicate that Hox, ParaHox and NK clusters were originally neighbours in the genome. There is also evidence that they constituted a large array of homeobox genes that I call here the ‘megacluster’, which would have followed distinct evolutionary pathways in each lineage.
The expression pattern of the three related sets of homeobox genes is intriguing, as each set seems to correspond broadly with one of the three embryonic germ layers: Hox genes are expressed in all germ layers, but predominantly in the neuroectoderm (REF. 6), ParaHox genes are expressed primarily in endodermal derivatives (REF. 5), and NK genes are expressed mostly in mesodermal derivatives (REF. 7). Until recently, the claim that the origin of the homeobox clusters was related to the origin and diversification of the three germ layers was puzzling, as homeobox clusters are present even in simple metazoans, which ‘classically’ lack the third germ layer, the mesoderm. However, recent findings in cnidarians, and a critical re-evaluation of the complexity of the body
THE GENESIS AND EVOLUTION OF
HOMEOBOX GENE CLUSTERS
Jordi Garcia-Fernàndez
Abstract | Once called the ‘Rosetta stone’ of developmental biology, the homeobox
continues to fascinate both evolutionary and developmental biologists. The birth of the
homeotic, or Hox, gene cluster, and its subsequent evolution, has been crucial in mediating
the major transitions in metazoan body plan. Comparative genomics studies indicate
that the more recently discovered ParaHox and NK clusters were linked to the Hox cluster
early in evolution, and that together they constituted a ‘megacluster’ of homeobox genes
that conspicuously contributed to body-plan evolution.
NATURE REVIEWS | GENETICS, VOLUME 6 | DECEMBER 2005 | 881
FOCUS ON THE BODY PLAN
[Box 1 figure: cladogram of the main metazoan groups (choanoflagellates, sponges, cnidarians, acoelomorphs, protostomes, and the deuterostome echinoderms/hemichordates, cephalochordates and vertebrates) showing the inferred Hox gene complement of each group and of the last eubilaterian ancestor, together with the associated body-plan transitions (multicellularity, symmetry, bilateral symmetry, coelom, and neural crest and vertebrae).]
MACROEVOLUTIONARY EVENTS
Evolutionary processes that
occur above the species level;
for example, the origin of phyla
and changes in body-plan
organization.
THROUGHGUT
A gut that has two openings: a
mouth and an anus.
TRIPLOBLASTIC
An animal that has three
primary germ layers: the
ectoderm, endoderm and
mesoderm. Triploblasts include
acoelomorphs, protostomes and
deuterostomes.
CHORDATES
The members of a phylum of
animals (the Chordata) that is
characterized by the possession
of a notochord. It includes
urochordates (such as the
ascidians), cephalochordates
(amphioxus) and vertebrates.
PROTOSTOMES
One of the main groups of
bilaterally symmetrical animals.
The name derives from ‘proto’
(first) and ‘stome’ (mouth),
because the first opening of the
embryo (the blastopore)
becomes the definitive mouth.
plan of simpler metazoans, make it timely to propose an exciting idea: that a megacluster of homeobox genes appeared in evolution together with (and might have been causal to) the appearance of the three germ layers, and that the origin of TRIPLOBLASTY predated the cnidarian–bilaterian split.

Here I summarize our current knowledge of the origins and structure of the Hox and ParaHox clusters. I also describe the structure and function of the third Hox-like gene cluster, the NK cluster, in CHORDATES and PROTOSTOMES, and discuss what we know about the germ-layer specificity of the three clusters. Finally, I approach the enigma surrounding the reasons for the clustering of the homeobox genes. Decades after collinearity was recognized, the constraints that maintain the clusters in some lineages but not in others are still not well understood. I conclude with suggestions for future research in the field: in particular, I highlight the need to identify more model organisms, which should be chosen on the basis of their key phylogenetic position.
Box 1 | Evolution of the Hox cluster during metazoan evolution

During evolution, large MACROEVOLUTIONARY EVENTS markedly altered the metazoan body plan and gave rise to the morphological diversity and complexity of current phyla (REF. 78). The cladogram shows the main metazoan groups and the associated body-plan transitions (indicated by red circles). The closest unicellular relatives of metazoans are the choanoflagellates (REF. 61); the question marks indicate uncertainty about the Hox gene complement in these evolutionary positions. The first body-plan transition in metazoans was the origin of radial symmetry, which gave rise, in the first instance, to cnidarians. The origin of bilaterality involved the generation of two body axes (anteroposterior and dorsoventral), the endomesoderm, and a nervous system that was concentrated at the anterior (but see REF. 79 for alternative views on the origin of the nervous system). Acoelomorphs (acoel flatworms) are the simplest bilateral representatives (REF. 16). Higher bilaterians (Eubilateria) are protostomes (arthropods, nematodes, annelids, molluscs and platyhelminthes, among other phyla) and deuterostomes (hemichordates, echinoderms and chordates). These two groups arose from the last eubilaterian ancestor, which had coelomic cavities, was segmented, and had a THROUGHGUT, an excretory system and brain ganglia (REF. 16). Non-vertebrate chordates (urochordates and cephalochordates such as amphioxus) did not undergo the metazoan transition that generated the vertebrate lineage, which is characterized by having neural-crest-derived tissues, vertebrae and high brain complexity. (See main text for alternative views on the origin of bilaterality.)

The figure illustrates one plausible explanation for the current composition of the homeobox (Hox) cluster in each main metazoan group. Hox genes are divided into four distinct classes: Anterior (A, and the corresponding paralogous groups 1 and 2, in purple), Group 3 (3, in yellow), Central (C, corresponding to paralogous groups 4–8 in deuterostomes and groups 4, 5, Antennapedia (Antp) and Ultrabithorax/abdominal A (Ubx/abdA) in protostomes, in green), and Posterior (P, P1/2 in protostomes, and paralogous groups 9–12 or 9–14 in deuterostomes, in red) genes. Cluster duplications in vertebrates are not indicated for simplicity. Red arrows indicate the inferred Hox gene composition at relevant morphological transitions. The Hox cluster of the common ancestor of cnidarians and bilaterians would have contained a member of each gene class. According to classical views, cnidarians secondarily lost the Group 3 and Central genes, and the origin of bilaterality was not coincident with an increase in the complexity of the Hox cluster. See BOX 2 for alternative hypotheses on the composition of the primordial Hox cluster.
DEUTEROSTOMES
The second of the two main
groups of bilaterally
symmetrical animals. The name
derives from ‘deutero’ (second)
and ‘stome’ (mouth), which
refers to the origin of the
definitive mouth as an opening
that is independent from the
blastopore of the embryo.
AMPHIOXUS
The invertebrate that is most
closely related to vertebrates.
The Hox cluster
Hox genes are crucial in positioning body structures along the A–P axis. How they do so, and which gene network programmes they regulate, are the principal questions of current research. Based on phylogenetic reconstructions, Hox genes can be classified into four groups (Anterior, Group 3, Central and Posterior) according to their position in the cluster and site of expression along the body axis (REFS 8–10; BOX 1). Although the origin of Hox genes cannot be reconstructed with certainty, the last common ancestor of protostomes and DEUTEROSTOMES is thought to have had a single Hox cluster composed of 7–9 genes. The cluster then expanded to include 8 or 9 members in the various protostome lineages (REF. 11), and up to 14 in the chordates (REFS 12,13; BOX 1). Cluster duplication followed by gene loss accounts for the 39 Hox genes that are now seen in the 4 mammalian clusters, or the up to 14 clusters in the fish lineages (REF. 14). The Hox gene composition in simpler metazoans was not determined until recently, after extensive searches by degenerate PCR and EST screens, and by genome sequencing. Irrespective of lineage-specific duplications, however, at least 4 genes were present in the ancestor of the earliest extant simpler bilaterians, the acoelomorph flatworms (REFS 15,16), and 2 genes in the ancestor of cnidarians (REF. 17).
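The statement that cluster duplication followed by gene loss accounts for the 39 mammalian Hox genes implies a simple piece of arithmetic. The sketch below is illustrative only: it assumes that every one of the 4 clusters started as a full copy of a single ancestral cluster and that no genes were gained afterwards. The ancestral cluster size is a parameter, because the text gives only an upper bound of 14 genes for chordates; the function name is hypothetical.

```python
def inferred_losses(n_clusters: int, ancestral_genes: int, genes_retained: int) -> int:
    """Gene losses implied by duplication-then-loss, assuming no gene gains.

    n_clusters      -- number of clusters after duplication (4 in mammals)
    ancestral_genes -- genes in the pre-duplication cluster (an assumption)
    genes_retained  -- genes observed today (39 in mammals, per the text)
    """
    slots = n_clusters * ancestral_genes  # genes present right after duplication
    if genes_retained > slots:
        raise ValueError("more genes retained than duplication alone can supply")
    return slots - genes_retained
```

Under these assumptions, a 13-gene ancestral cluster would imply `inferred_losses(4, 13, 39)` losses, and a 14-gene cluster would imply `inferred_losses(4, 14, 39)`; the difference between the two shows how sensitive the inferred number of losses is to the assumed ancestral state.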
The ParaHox cluster
Several Hox-related genes were traditionally classified as ‘orphans’: they were more closely related to some Hox paralogous groups than other paralogous groups were to each other. This apparent paradox was clarified by Brooke et al. (REF. 5), who found that three of these orphan genes were closely linked in the AMPHIOXUS genome. This new cluster, which they called ParaHox, is the paralogue of the Hox cluster. As an intact cluster it has so far been found only in chordates, although the individual genes (caudal-type homeobox (Cdx), genomic screened homeobox (Gsh) and Xenopus laevis homeobox 8/insulin promoter factor 1 (Xlox/Ipf1)) are also present, albeit scattered, in other deuterostome and protostome genomes (REFS 10,18–20).
Origin of the ParaHox cluster. Most phylogenetic models (REFS 8,10,21,22) indicate that Hox and ParaHox clusters arose before the divergence of cnidarians and bilaterians (BOX 1) through the duplication of a hypothetical ProtoHox cluster. This cluster would have consisted of four genes (one Anterior, one Group 3, one Central and one Posterior). After duplication, the ParaHox cluster would have lost the Central gene, and the Hox cluster would have expanded centrally by further cis-duplication events (FIG. 1). More recently, however, other models have suggested that the ProtoHox cluster possessed only two or three genes (BOX 2). Irrespective of how many genes were contained in the ProtoHox cluster, the linkage of Hox and ParaHox genes indicates that the ProtoHox cluster itself originated through cis-duplications of a founder ProtoHox gene (FIG. 1).
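The four-gene scenario described above (duplication of the ProtoHox cluster, loss of the Central gene from the ParaHox copy, and central expansion of the Hox copy) can be written as three list operations. This is a toy sketch under stated assumptions, not a reconstruction method: the function names are hypothetical, genes are represented only by their class labels, and the number of extra Central copies (4, giving five Central genes) is chosen to match paralogous groups 4–8 of deuterostomes.

```python
from typing import List, Tuple

# Four-gene ProtoHox cluster, one gene per class, as in the four-gene model.
PROTOHOX: List[str] = ["Anterior", "Group3", "Central", "Posterior"]


def duplicate_cluster(cluster: List[str]) -> Tuple[List[str], List[str]]:
    """Whole-cluster duplication: return two independent copies."""
    return list(cluster), list(cluster)


def lose_gene(cluster: List[str], gene_class: str) -> List[str]:
    """Remove every gene of the given class from the cluster."""
    return [g for g in cluster if g != gene_class]


def expand_in_cis(cluster: List[str], gene_class: str, extra_copies: int) -> List[str]:
    """Tandem (cis) duplication: add extra copies next to the original gene."""
    expanded: List[str] = []
    for g in cluster:
        expanded.append(g)
        if g == gene_class:
            expanded.extend([g] * extra_copies)
    return expanded


hox, parahox = duplicate_cluster(PROTOHOX)
parahox = lose_gene(parahox, "Central")               # Gsh, Xlox, Cdx remain
hox = expand_in_cis(hox, "Central", extra_copies=4)   # assumed: groups 4-8
```

After these steps `parahox` holds three genes (Anterior, Group 3 and Posterior classes, corresponding to Gsh, Xlox and Cdx) and `hox` holds eight, five of them Central. The two- and three-gene models would start from a shorter `PROTOHOX` list and add cis-duplication steps instead of a loss.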
Evx and Meox. Hox and ParaHox genes belong to the ANTP class of homeobox genes, one of the main classes of animal-specific homeobox genes. The ANTP class also includes two other genes, even-skipped homeotic gene (Evx) and mesenchyme homeobox (Meox; also known as Mox), which are phylogenetically close to Hox and ParaHox genes, and are located on either side of the Hox cluster in vertebrates (REF. 23; FIG. 1). Evx is closely linked to the 5′ end of a Hox gene even in a cnidarian species (REF. 24). Recently, three independent phylogenetic analyses (REFS 20,25,26) and further genome-mapping studies have led to the proposal that Evx and Meox originated by duplication of an ancestral gene, the Evx/Meox ancestor (or ProtoMoeve; REFS 26,27), which was adjacent to the 5′ end of the ProtoHox cluster. The ProtoHox cluster and ProtoMoeve duplicated in cis, leaving Evx 5′ to the Hox cluster, and Meox 3′ to the ParaHox cluster (FIG. 1). Chromosomal breakages on either side of the ParaHox cluster then left Meox and Evx at opposite ends of the Hox cluster, and left the ParaHox cluster isolated. Interestingly, the possibility that Hox and ParaHox clusters were physically linked is also indicated by circumstantial evidence: the linkage of Cdx to Hox1 in the urochordate Oikopleura dioica (REF. 28).
Mnx, Gbx and En. Three other members of the ANTP class are phylogenetically related to the Hox/ParaHox genes, and also map close to the Hox cluster in vertebrates and amphioxus (REFS 29–31). This three-gene cluster, which was given the name EHGbox (REF. 30), consists of motor neuron restricted (Mnx), gastrulation brain homeobox (Gbx) and engrailed (En).

Comparative genomics studies therefore support the existence of an extended Hox or ParaHox cluster early in metazoan evolution, which would have contained the gene classes discussed in this section: Mnx, Gbx, En, Evx, Hox, Meox and ParaHox genes (REF. 31).
The NK cluster
In 1989, Kim and Nirenberg reported the presence of four novel homeobox genes in D. melanogaster (REF. 32), which they named NK1 to NK4 (for Nirenberg and Kim). Of these, NK1, NK3 and NK4 mapped to the same cytological location, bands 93D/E on chromosome 3. NK2 is located on chromosome 1. Subsequently, the genes were renamed to reflect the known mutant alleles they represented: NK1 = S59 or slouch (slou), NK2 = ventral nervous system defective (vnd), NK3 = bagpipe (bap), and NK4 = muscle-specific homeobox 2 (msh2) or tinman (tin). Other related homeobox genes were later found (REFS 7,33) near the original cluster of three NK genes: C15 (also known as 311, 93Bal or Hox11-311), ladybird early (lbe) and ladybird late (lbl). These last two genes are recent D. melanogaster duplicates. Jagla et al. (REF. 33) termed this cluster of six genes the 93D/E cluster, in reference to its cytological position on the third chromosome.

NK genes also have orthologues in chordates, so the NK cluster itself must have been present in early bilaterians. Pollard and Holland (REF. 30) identified remnants of four NK clusters in the working draft of the human genome (FIG. 2a). Note, however, that the nomenclature of NK genes has not yet been
[Figure 1 diagram: stepwise scheme from a ProtoHox-like gene, through the ProtoHox cluster and the Evx/Meox ancestor (ancestral Hox-like cluster), segmental tandem duplication and breakage of the coupled Hox-like cluster, to the primordial ParaHox cluster (Gsh, Xlox, Cdx) and the primordial extended Hox cluster (Meox, Hox genes 1–14, Evx), followed by the vertebrate duplications that gave the mammalian ParaHox A–D and Hox A–D clusters.]
FLUORESCENCE IN SITU
HYBRIDIZATION
A technique in which a
fluorescently labelled DNA
probe is used to detect a
particular chromosome or gene
with the help of fluorescence
microscopy.
systematized (TABLE 1): for example, NKX2-3, NKX2-5 and NKX2-6 are human orthologues of D. melanogaster tin (NK4), and NKX3-1 and NKX3-2 are orthologues of D. melanogaster bap (NK3).

As with the extended Hox and ParaHox clusters, the historical series of genome duplications, gene losses and chromosomal rearrangements hinders the inference of the ancestral, single NK cluster at the origin of vertebrates. However, Luke et al. (REF. 34) have recently characterized the organization of the NK cluster in amphioxus by genome cloning and FLUORESCENCE IN SITU HYBRIDIZATION. Interestingly, the NK clusters of both amphioxus and vertebrates are split at the same intergenic positions (FIG. 2a). In chordates, only the TLX (T-cell leukaemia homeobox)–LBX (ladybird-related homeobox) and the NK3–NK4 linkages have been retained from the ancestral cluster. Further genome mapping has added new genes to the NK cluster: muscle-specific homeobox genes (Msx) and NK5 (FIG. 2a).

Taken together, the data indicate that an NK cluster of 7 genes (Msx–NK4–NK3–Lbx–Tlx–NK1–NK5) must have existed in the last common ancestor of bilaterians, before the protostome–deuterostome split. In sharp contrast to the Hox and ParaHox clusters, which were maintained intact in the deuterostome lineage but not in the protostome lineages that have been analysed so far, the NK cluster has retained a largely compact ancestral organization in D. melanogaster and the mosquito, but was broken into three pieces in the chordate lineage.
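The comparison between an inferred ancestral gene order and the fragments seen in a present-day genome can be sketched as a check on which neighbouring gene pairs survive. The code below is a minimal illustration, not the mapping procedure used in the cited studies. The ancestral order is the seven-gene order given above; the fragment list is a deliberately simplified encoding of the two chordate linkages named in the text (NK3–NK4 and TLX–LBX), not a map of any real chromosome, and the function names are hypothetical.

```python
from typing import FrozenSet, List, Set

# Inferred ancestral NK cluster order, as given in the text.
ANCESTRAL_NK: List[str] = ["Msx", "NK4", "NK3", "Lbx", "Tlx", "NK1", "NK5"]


def adjacent_pairs(order: List[str]) -> Set[FrozenSet[str]]:
    """Unordered pairs of genes that are immediate neighbours in `order`."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def retained_linkages(
    ancestral: List[str], fragments: List[List[str]]
) -> Set[FrozenSet[str]]:
    """Ancestral neighbour pairs that are still neighbours in some fragment."""
    observed: Set[FrozenSet[str]] = set()
    for fragment in fragments:
        observed |= adjacent_pairs(fragment)
    return adjacent_pairs(ancestral) & observed


# Simplified, hypothetical encoding of the chordate situation described above:
# only the two linkages named in the text are kept together.
TOY_CHORDATE_FRAGMENTS: List[List[str]] = [
    ["Msx"], ["NK4", "NK3"], ["Lbx", "Tlx"], ["NK1"], ["NK5"],
]

kept = retained_linkages(ANCESTRAL_NK, TOY_CHORDATE_FRAGMENTS)
```

With this toy input, `kept` contains exactly the NK4–NK3 and Lbx–Tlx pairs out of the six neighbour pairs in the ancestral order. Using unordered pairs means that local inversions, which reverse gene order but keep neighbours together, do not count as lost linkages.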
The ANTP megacluster
From the data reported in the above sections, a full evolutionary model for the genesis and evolution of the ANTP-class homeobox genes can be traced (FIG. 3). Early in metazoan evolution, a ProtoANTP founder gene generated two genes by cis-duplication: ProtoHox-like and ProtoNK. Each gene was amplified by cis-duplication, giving rise to an array
Figure 1 | Genesis and evolution of Hox and ParaHox clusters. A founder ProtoHox-like gene produced, through a
series of cis-duplications, an ancestral Hox-like cluster that consisted of the ProtoHox cluster linked to the ancestor of
even-skipped homeotic gene (Evx) and mesenchyme homeobox (Meox). Segmental tandem duplications generated a
continuous array of primordial ParaHox, Meox, Hox and Evx genes, which was subsequently broken between the posterior
ParaHox gene caudal-type homeobox (Cdx) and the Meox gene (red arrow). Further cis-duplications and evolution by
expansion and genome doublings led to the current mammalian complement of extended Hox and ParaHox genes,
consisting of four clusters of each cluster type (A–D). A further gene cluster, EHGbox (not shown), exists that consists
of gastrulation brain homeobox (Gbx), motor neuron restricted (Mnx) and engrailed (En). This cluster was created by
cis-duplication of a founder gene that was probably adjacent to the ProtoHox gene. Colour codes for the four paralogous
groups are: Anterior, purple; Group 3, yellow; Central, green; Posterior, red. Gsh, genomic screened homeobox; Xlox,
Xenopus laevis homeobox 8.
[Box 2 figure: panels a, b and c illustrating the four-gene, three-gene and two-gene models, respectively. Labels include: ProtoHox cluster; duplication of ProtoHox cluster; primordial Hox cluster; primordial ParaHox cluster; independent tandem duplication of Hox genes; independent tandem duplication of Hox and ParaHox genes; three-gene ParaHox cluster; four-gene Hox cluster. Genes shown: A, 3, C, P; paralogous groups 1/2, 3, 4/8, 9/14; Gsh, Xlox, Cdx.]
DIPLOBLASTIC
An animal that has only two
primary germ layers: the
ectoderm and the endoderm.
Diploblasts include the
cnidarians, the ctenophores and
— according to some authors —
the placozoans and the
poriferans.
CAMBRIAN EXPLOSION
The sudden appearance, about
520 million years ago, in the
fossil record of many major
groups (phyla) of bilaterian
animals.
of Hox, ParaHox and NK clusters (FIG. 3). Two chromosomal breakages eventually split the megacluster into the three current syntenic ParaHox, Hox and NK regions.

Although no current animal genomes are known that maintain the original set of ANTP-class genes in a single array, the phylogenetic and gene-map reconstruction that is shown in FIG. 3 indicates that a megacluster of 25 homeobox genes, including the Hox, ParaHox and NK clusters, existed early in animal evolution. The precise timing of the initial cis-duplications is unknown. Extended Hox, ParaHox and NK genes are present in the DIPLOBLASTIC cnidarians (REFS 17,35). Therefore, the genesis of the megacluster must have predated the cnidarian–bilaterian transition. No definite Hox or ParaHox gene has been described in the simplest
Box 2 | Origin of the ProtoHox cluster: two-, three- and four-gene models

The homeobox (Hox) and ParaHox (its paralogue) clusters arose early in evolution from the duplication of an ancestral ProtoHox cluster, but the number of genes present in the ProtoHox cluster at the time of duplication is still uncertain. The original report of the ParaHox cluster (REF. 5) proposed a model in which the ProtoHox cluster was composed of four genes (one Anterior (A), one Group 3 (3), one Central (C), and one Posterior (P)). According to this model, shown in panel a, this cluster gave rise to a primordial Hox cluster of four genes (paralogous groups 1/2, 3, 4/8, 9/14), and a primordial ParaHox cluster of four genes (genomic screened homeobox (Gsh), Xenopus laevis homeobox 8/insulin promoter factor 1 (Xlox/Ipf1), C and caudal-type homeobox (Cdx)). The ParaHox cluster would have lost the C gene, and the Hox cluster would have expanded differently in the various bilaterian lineages.

An alternative, the three-gene model, which was also mentioned in the original report (REF. 5), was developed soon after (REFS 21,22). According to this view, shown in panel b, the C Hox gene (4/8) originated by cis-duplication of 3 (as indicated by a blue arrow).

More recently, a two-gene model has also been proposed (REF. 27; panel c), in which Gsh generated the C ParaHox gene Xlox (blue arrow) by cis-duplication, and the anterior Hox gene (1/2) cis-duplicated sequentially to generate both 3 and 4/8 genes (blue arrows).

Phylogenetic analyses cannot distinguish between these three possibilities, but current data obtained from simpler metazoans (REFS 15–17) favour the two-gene or three-gene models as being the most parsimonious. Cnidarians seem to possess only anterior and posterior Hox and ParaHox genes (Hox 1/2 and 9/14, and ParaHox Gsh and Cdx; other Hox genes are probably cnidarian-specific duplicates) (REFS 17,27). In both the two-gene and three-gene models, three events are required to account for the current Hox and ParaHox complement of cnidarians and bilaterians: the generation of Xlox, 3 and 4/8 in bilaterians (two-gene model), or the generation of 4/8 together with the loss of Xlox and 3 in cnidarians (three-gene model). Phylogenetic analyses support (although weakly) a close relationship between 3 and Xlox, therefore favouring a three-gene model, whereas the two-gene model fits nicely with the increase in complexity of Hox and ParaHox clusters that correlates with major metazoan transitions, such as the origin of bilaterians and the CAMBRIAN EXPLOSION (REF. 27).
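The parsimony argument in this box amounts to counting the evolutionary events each model needs. The sketch below simply encodes the events listed in the box text for the two-gene and three-gene models and counts them; the data structure and function names are hypothetical, and the four-gene model is left out because the box does not give an explicit event count for it.

```python
from typing import Dict, List

# Events required by each model, transcribed from the box text.
MODEL_EVENTS: Dict[str, List[str]] = {
    "two-gene": [
        "generation of Xlox in bilaterians",
        "generation of Group 3 in bilaterians",
        "generation of 4/8 in bilaterians",
    ],
    "three-gene": [
        "generation of 4/8",
        "loss of Xlox in cnidarians",
        "loss of Group 3 in cnidarians",
    ],
}


def event_counts(models: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of inferred events per model."""
    return {name: len(events) for name, events in models.items()}


def most_parsimonious(models: Dict[str, List[str]]) -> List[str]:
    """Names of all models that tie for the fewest events, sorted by name."""
    counts = event_counts(models)
    fewest = min(counts.values())
    return sorted(name for name, n in counts.items() if n == fewest)
```

Both models come out at three events, so `most_parsimonious(MODEL_EVENTS)` returns both names. This is why event counting alone cannot choose between them, and why the box turns to other lines of evidence (the weak 3–Xlox relationship, or the correlation with major metazoan transitions).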
[Figure 2 diagram. Panel a: maps of the NK cluster genes in A. gambiae (labels X, 2R), D. melanogaster (Dr, tin, bap, lbl, lbe, C15, slou, NK5), amphioxus (Msx, NKx4, NKx3, Lbx, Tlx, NKx1b, NKx1a) and humans (10q24–26, 8p21, 2p14, 5q34, 4p16), with the inferred PDA cluster (Msx, NK4, NK3, Lbx, Tlx, NK1, NK5). Panel b: regulatory interactions among tin, bap, lbe, lbl, C15 and slou, with tissue labels: mesoderm, cardioblasts; visceral mesoderm; pericardial cells, cardioblasts; alary muscles; somatic musculature, dorsal vessel.]
metazoans (the sponges). However, sponges do have NK genes (Msx, Tlx, NK3 and NK2; REFS 36–38), which are currently unmapped. The finding of NK genes in sponges implies that the common ancestor of sponges and other metazoans (right at the origin of metazoans) possessed at least a single Hox-like gene. However, intensive searches for sponge Hox genes have been unsuccessful: Hox genes have either been lost in the sponge lineages analysed so far or they have escaped traditional or PCR screens due to sequence divergence. Genome-sequencing projects might be informative in this regard.
One cluster per germ layer?
The bilaterian body plan is built on three embryonic
germ layers: ectoderm, endoderm and mesoderm.
Figure 2 | The NK cluster. a | Comparison of the NK homeobox gene clusters of Anopheles gambiae, Drosophila melanogaster, amphioxus and humans. Colour coding denotes gene orthology. Cluster breaks are denoted as follows: //, intergenic distance >1 Mb; ///, very large intergenic distance; zigzags, chromosomal transpositions. Local inversions are omitted for simplicity. The originally described NK cluster of D. melanogaster (REF. 7) consisted of six genes (tinman (tin), bagpipe (bap), ladybird early (lbe), ladybird late (lbl), C15 and slouch (slou)). New genes have been added to this list: NK5, which is closely linked to NK1 in A. gambiae (human NKX5-1 is also loosely linked to NKX1-2), and Msx (muscle-specific homeobox), which is also found near the NK cluster in mosquitoes, amphioxus and humans. By contrast, D. melanogaster NK5 is distant from the fly Msx homologue, Drop (Dr, or muscle-specific homeobox (msh)), and is on the ‘wrong’ side of the cluster; this indicates that an inversion occurred that included the NK cluster, but not NK5. The complete NK cluster in the last common ancestor of protostomes and deuterostomes (PDA) is thought to have comprised seven genes: Msx, NK4, NK3, Lbx (ladybird-related homeobox), Tlx (T-cell leukaemia homeobox), NK1 and NK5, in this order. b | Gene regulatory interactions and temporal collinearity of the D. melanogaster NK cluster in mesodermal derivatives. Time goes from left to right. tin is the first gene to be expressed; it appears in all mesodermal cells and is then restricted to the dorsal mesodermal cells at gastrulation. tin is crucial in specifying the dorsal mesoderm derivatives. tin then activates its neighbouring NK gene bap in the visceral mesoderm primordia. The remaining NK cluster genes function after the cardiac, visceral and somatic primordia have been established. Finally, tin is upregulated again, and is restricted to a sub-population of cardioblast and pericardial cells (REF. 7). The regulatory interactions are not always direct and occur in some, but not all, mesodermal cells in which the genes are expressed. Green arrows indicate upstream activation. Pink lines indicate inhibitory function. Panels a and b are modified, with permission, from REF. 80 © (2004) Graham Luke, University of Reading, UK.
HEMICHORDATES
A phylum of deuterostome
marine worms that comprises
the enteropneusts and
pterobranchs.
Comparative expression analyses of Hox, ParaHox and NK genes show that these gene classes are predominantly expressed in ectodermal, endodermal and mesodermal derivatives, respectively. This raises the intriguing hypothesis that the origin of the homeobox megacluster was linked to the origin or patterning of the germ layers. In this section, I summarize current data on homeobox gene expression, and qualify the interpretation of this expression pattern by discussing some recent data that allow us to revisit the origin of triploblastic animals and of bilaterality.
Hox gene expression. Undoubtedly, the Hox cluster has an important role in the A–P patterning of the body. Nevertheless, the expression and function of Hox genes in D. melanogaster and chordates are most evident in ectodermal and neuroectodermal tissues. Gene expression data from amphioxus indicate that the primitive role of the Hox cluster in chordates was to pattern the neuroectoderm, including both the CNS and the PNS (REF. 6). In addition, Lowe et al. (REF. 39) elegantly described the staggered expression of Hox genes in HEMICHORDATES, in which Hox genes are expressed in circular areas in the ectoderm around the entire animal, with an A–P arrangement that is nearly identical to that found in chordates.
ParaHox gene expression. In contrast to the largely ectodermal expression pattern of Hox genes, ParaHox genes are expressed mainly in endodermal tissues. In amphioxus, ParaHox genes also show spatial and temporal collinearity (REF. 5), although to a lesser extent than Hox genes (J.G.-F. & P.W.H. Holland, unpublished observations).

The posterior ParaHox gene Cdx is expressed in caudal tissues and its dominant site of expression is in the posterior endoderm. For example, in vertebrates, Cdx genes are expressed in the most caudal region of the embryo, and Cdx1 and Cdx2/3 are important in the early processes of intestinal morphogenesis (REF. 40). Similarly, in flies, the caudal gene is expressed posteriorly (REF. 41).

The central ParaHox gene Xlox (or Ipf1) is also expressed in the endoderm in chordates, although it
Table 1 | Nomenclature of genes in the NK cluster

Each entry lists, for one gene family, the Drosophila melanogaster gene (with FlyBase ID number), the human genes (with GenBank accession number) and the amphioxus genes (with GenBank accession number).

NK1/slou*
  Drosophila melanogaster: slouch (slou; also known as NK1, S59 or paired-like 9); FBgn0002941
  Human: NKX1-1 (also known as spinal cord axial homeobox gene 2 (SAX2)); XP172406
  Human: NKX1-2 (also known as spinal cord axial homeobox gene 1 (SAX1)); XP061241
  Amphioxus: NKx1a; AL671989
  Amphioxus: NKx1b; BFL551450

NK3/bap
  Drosophila melanogaster: bagpipe (bap; also known as NK3); FBgn0004862
  Human: NKX3-1 (also known as bagpipe homeobox (BAX) or NKX3A); NM_006167
  Human: NKX3-2 (also known as bagpipe homeobox homologue (BAPX1) or NKX3B); AF009801
  Amphioxus: NKx3; AL513308

NK4/tin
  Drosophila melanogaster: tinman (tin; also known as muscle-specific homeobox 2 (msh2) or NK4); FBgn0004110
  Human: NKX2-3 (also known as NK-2C); BC025788
  Human: NKX2-5 (also known as cardiac-specific homeobox (CSX1) or NK-2E); AB021133
  Human: NKX2-6 (also known as thyroid transcription factor 1 (TITF1) or NK-2); NM_003317
  Amphioxus: NKx4 (also known as NKx2-tin); AF032999

NK5/Hmx
  Drosophila melanogaster: H6-like (Hmx); FBgn0014858
  Human: NKX5-1 (also known as homeobox H6 family 3); XM_291716
  Human: NKX5-2 (also known as homeobox H6 family 2 (HMX2)); NM_005519
  Amphioxus: –

Lbx‡
  Drosophila melanogaster: ladybird early (lbe; also known as Hox11-D125); FBgn0011278
  Drosophila melanogaster: ladybird late (lbl; also known as NK cluster homeobox 4 (NKCH4)); FBgn0008651
  Human: Ladybird homeobox homologue 1 (LBX1); NM_006562
  Human: Ladybird homeobox homologue 2 (LBX2); AC005041
  Amphioxus: Lbx; AL672000

Tlx
  Drosophila melanogaster: C15 (also known as 311, 93 Bar-like (93Bal) or Hox11-311); FBgn0004863
  Human: T-cell leukaemia, homeobox 1 (TLX1; also known as T-cell leukaemia, homeobox 3 (TCL3) or HOX11); NM_005521
  Human: T-cell leukaemia, homeobox 2 (TLX2; also known as homeobox 11-like 1 (HOX11L1), neural crest homeobox (NCX) or enteric neuron homeobox (ENX)); HSAJ2607
  Human: T-cell leukaemia, homeobox 3 (TLX3; also known as homeobox 11-like 2 (HOX11L2) or respiratory neuron homeobox (RNX)); HOS223798
  Amphioxus: Tlx; BFL551449

Msx
  Drosophila melanogaster: Drop (Dr; also known as 99B, muscle-specific homeodomain 1 (msh1) or lottchen); FBgn0000492
  Human: Muscle-specific homeobox 1 (MSX1; also known as HOX7); M97676
  Human: Muscle-specific homeobox 2 (MSX2; also known as HOX8); D26145
  Amphioxus: Msx; BFL130766

*Amphioxus and human genes, and ‡Drosophila melanogaster and human genes are not direct orthologues; they arose independently in the different lineages. Drosophila melanogaster data are from FlyBase; human data are from the Human Genome Organization; and amphioxus data are from REFS 34,80.
NATURE REVIEWS
|
GENETICS VOLUME 6
|
DECEMBER 2005
|
887
FOCUS ON THE BODY PLAN
© 2005 Nature Publishing Group
ProtoANTP
ProtoHox-like ProtoNK
12345
6/8
9/14 Evx Gbx En
Mnx
Dlx Msx
NK4
NK3
Lbx
Tlx
NK1
NK5
Emx NK6
ParaHox
ParaHox genes
(endoderm)
Hox genes
(ectoderm)
Hox NK
ProtoHox cluster
ProtoHox
Evx/Meox
NK genes
(mesoderm)
Step 1
Step 2
Step 3
Step 4
Step 5
Step 6
Step 7
Step 8
Gsh
Xlox
Cdx
Meox
has a more central domain of expression than Cdx (as
predicted from its more central location within the
cluster). Xlox is crucial during embryonic development,
and for the differentiation of the endocrine pancreas
and anterior duodenum
42
. No Xlox homologue exists
in the D. melanogaster genome, but in leeches the Xlox
gene is expressed in midgut tissues
43
.
Bearing in mind the collinearity of the ParaHox cluster, we would expect to see expression of the anterior ParaHox gene, Gsh, in anterior endodermal tissues. Instead, Gsh genes have been detected in the CNS in vertebrates. In addition, in D. melanogaster, the Gsh gene (intermediate neuroblast defective, ind) is expressed in the neuroectoderm (REF. 44). How can we explain the lack of Gsh expression in the anterior endoderm? If cluster cohesion were necessary to drive anterior gut expression in D. melanogaster then the lack of anterior gut expression could be accounted for by the deterioration of the ParaHox cluster in this genus. If the ParaHox cluster originally patterned the endoderm collinearly (with Gsh around a primitive mouth) then the lack of anterior endodermal expression of Gsh in vertebrates could be due to the radical alteration of the anterior gut during deuterostome evolution. It is tempting to predict that in protostomes that have intact ParaHox clusters Gsh would be expressed in the anterior endoderm (REF. 45). Furthermore, a recent report (REF. 46) maps the expression of vertebrate Gsh genes to endodermal derivatives, and shows that Gsh1 and Gsh2 are expressed in the mouse adult pancreas.
NK gene expression. The expression of NK cluster genes has been predominantly studied in D. melanogaster (TABLE 2). An exciting finding is that all six members of the originally described 93D/E cluster (tin, bap, lbe, lbl, C15 and slou) participate in early mesodermal patterning and differentiation in D. melanogaster (reviewed in REF. 7). There are many regulatory interactions between these genes, which, although not arranged hierarchically, have tin as the common initiator or modulator (FIG. 2b). Whether the expression pattern of NK genes in D. melanogaster is collinear is unclear. Jagla et al. (REF. 7) suggest that NK genes show temporal collinearity with respect to the order of the genes in the cluster, in an analogous manner to the Hox and ParaHox clusters, and that the functional constraints that have maintained the compact structure of the NK cluster in D. melanogaster are related to a coordinated expression that is linked to temporal collinearity and sequential functional deployment of these genes during mesodermal development.
Figure 3 | The homeobox megacluster. Model for the genesis of the ANTP-class homeobox megacluster. Early in animal
evolution, an ANTP-class homeobox gene ProtoANTP (step 1) was duplicated, giving rise to a ProtoHox-like gene and a
ProtoNK gene (step 2). Through a series of tandem duplications, each family expanded to form an NK cluster (green) and
a series of Hox-like genes (other colours; steps 3–6), leading eventually (step 7) to the generation of the Hox and ParaHox
clusters, including Evx and Meox (grey), and the extended Hox genes (light blue). The last step (step 8) shows the
predicted composition of the megacluster in the last common ancestor of protostomes and deuterostomes. The arrows
indicate the chromosomal breakages that split the megacluster into at least three pieces, giving three main genomic
blocks that are known as the ParaHox, Hox, and NK regions. The red horizontal bars indicate the limits of the ParaHox,
Hox and NK clusters. Steps 1 to 3 occurred before the radiation of sponges; the temporal order of steps 4 to 6 is unclear;
step 7 might represent the last common ancestor of cnidarians and bilaterians; the labels below step 8 indicate the
preferential place of expression of ParaHox, Hox and NK cluster genes. Cdx, caudal-type homeobox; Dlx, Distal-less
homeobox; Emx, empty-spiracles homologue; En, engrailed; Evx, even-skipped homeotic gene; Gbx, gastrulation
brain homeobox; Gsh, genomic screened homeobox; Lbx, ladybird-related homeobox; Meox, mesenchyme homeobox;
Mnx, motor neuron restricted homeobox; Msx, muscle-specific homeobox; Tlx, T-cell leukaemia homeobox; Xlox/Ipf1,
Xenopus laevis homeobox 8/insulin promoter factor 1; 1–14, Hox paralogous groups 1 to 14.
PLANULA BLASTOPORE
The opening of the planuloid
larva. A planula is the
swimming larva of cnidarians.
Its ciliated epidermis covers
either a solid or a hollowed-out
mass of endoderm cells. The
blastopore will develop into the
oral opening of the polyp.
In vertebrates the NK cluster is split into three and, as mentioned above, only two gene pairs are maintained close together: NK3–NK4 and Lbx–Tlx (FIG. 2a). The expression data for the many mammalian NK genes during embryonic development are complex (TABLE 2): all are expressed in mesodermal derivatives, and some (NK4, Lbx and Tlx) are involved in heart development (REFS 47,48). Vertebrate NK genes seem not to have retained collinear expression, although some authors (REFS 49,50) have proposed that combinations of NK4 and NK2 paralogues pattern an anteroventral domain that includes the mesodermal cardiac field and several endodermal organs.
Three clusters, three germ layers. In summary, comparative genomics and functional and expression data from protostomes and deuterostomes point to a thought-provoking idea: the three ANTP-class homeobox gene clusters show a degree of preference, and collinearity, for the three bilaterian germ layers (Hox for ectoderm, ParaHox for endoderm, and NK for mesoderm). It is tempting to speculate that, early in metazoan evolution, a core array of homeobox genes duplicated, giving rise to three sets of clustered genes with distinct patterning properties. On this view, the deployment of these clusters would have facilitated the evolutionary origins, or the independent patterning, of the germ layers.
The origin of triploblasty and bilateral symmetry
Diploblastic cnidarians have Hox, ParaHox and NK genes, and the common ancestor of cnidarians and bilaterians must already have possessed the three clusters. Therefore, if we point to the three clusters to justify the theory of the origin of the three germ layers, how can we explain the presence of the three gene classes (and the clustering) in the last common ancestor of cnidarians and bilaterians? This ancestor would have existed before the origin of the mesoderm and of bilateral symmetry, two major evolutionary inventions that appeared simultaneously in animal evolution.
What seems to be a Kafkaesque paradox has been challenged by the recent work of Finnerty and colleagues (REF. 17), whose data on the starlet sea anemone Nematostella vectensis indicate that it is time to rethink the origins of bilateral symmetry (whether cnidarians are diploblastic is also a subject of current debate; REFS 51,52). Sea anemones belong to the basal cnidarian class Anthozoa. A cross section of the adult does not show radial symmetry, but shows a plane of bilateral symmetry, which is known as the directive axis. This bilateral symmetry was noted by Stephenson in 1928 (REF. 53) and by Hyman in 1940 (REF. 54), but was not included in any textbook. The Hox genes of N. vectensis are expressed in staggered domains, which are reminiscent of their expression pattern in bilaterians (REF. 17). Only the anterior and posterior Hox genes of N. vectensis show nested expression: the anterior paralogue anthox6 gene is expressed close to the PLANULA BLASTOPORE, and the posterior paralogue anthox1 is expressed at the opposite end. So, the data indicate that the oral–aboral axis of cnidarians is homologous to the A–P axis of bilaterians. As for the second axis, the decapentaplegic (dpp) gene (which is pivotal for marking the dorsal side in arthropods and the ventral side in vertebrates) is expressed asymmetrically on one side of the blastopore in sea anemones and corals (REFS 17,55). This indicates that the directive axis of sea anemones is homologous to the dorsoventral axis of bilaterians (REF. 56). Furthermore, the medusa stage of some cnidarians has a well-developed striated muscle layer, which is derived from a mesoderm-like third layer (the entocodon) that is established at the onset of medusa formation (REF. 52).
Therefore, if cnidarians develop through the same process that leads to bilateral symmetry, and if the current radially symmetrical appearance was achieved secondarily, and if cnidarians have a tissue layer that is homologous to the mesoderm, then the genesis of the three clusters that are linked to the ectoderm, endoderm and mesoderm could still make sense. Where, then, should we locate the origin of bilaterians and of the three germ layers?
Sponges might hold the answer. They are classically considered to be simple metazoans, with no obvious symmetry. The ANTP-class homeobox complement of sponges is currently limited to a few NK-like genes, whereas no Hox or ParaHox gene has been found (REF. 38). It is tempting to place the genesis of the megacluster at the sponge–cnidarian divergence, right at the origin of bilaterality and triploblasty (assuming that cnidarians are bilaterians and triploblastic). Still, if this reasoning holds true, why do sponges have several NK genes, and no Hox or ParaHox gene? The fact that no Hox or ParaHox gene has been identified could simply be due to the technical difficulty of finding old, divergent or embryonically expressed sequences, or to the choice of non-basal species. Nevertheless, the primitiveness of sponges is currently under debate (REF. 57). It is unclear whether sponges form a monophyletic sister group to the rest of the metazoans, or are paraphyletic. As with cnidarians, evolutionary insights are being gained from studies of embryonic development. Sponge larvae are architecturally closer than adult sponges to other metazoans. One possibility is that other metazoans (including cnidarians) evolved from a NEOTENOUS larva of ancient sponges, and that sponges were, in fact, the simplest bilaterian metazoans (REF. 58), having a single or only a few Hox-like genes and several NK genes. This would imply that no basal non-bilaterian animals currently exist. An intriguing exception might be the PLACOZOANS, an enigmatic group that have no symmetry and a Hox-like gene that might resemble an ancestral Hox or ParaHox gene (REF. 59).

Table 2 | Collinear NK cluster gene expression in fruitflies and mice

| Gene class* | Tissue of expression | Fruitfly | Mouse |
|---|---|---|---|
| NK1 (slou) | Mesoderm | Yes | No |
| | Endoderm | Yes | No |
| | CNS | Yes | Yes |
| | PNS | Yes | Yes |
| NK3 (bap) | Mesoderm | Yes | Yes |
| NK4 (tin) | Mesoderm | Yes | Yes |
| | Pharynx | Yes | Yes |
| Lbx (lb) | Heart | Yes | Yes |
| | CNS | Yes | Yes |
| | PNS | Yes | Yes |
| Tlx (C15) | Heart | Yes | Yes |
| | CNS | Yes | Yes |
| | PNS | Yes | Yes |

*NK genes, from the originally described Drosophila melanogaster NK cluster (REF. 7) at cytological position 93D/E, that show temporal collinear expression, and their vertebrate orthologues. Data are from REFS 7,80.

NEOTENOUS
The term neoteny describes the retention of juvenile characteristics in the adult of an animal species.
PLACOZOANS
Simple balloon-like marine animals with a body cavity that is filled with pressurized fluid. They lack most organs and tissues, including a nervous system. They are either the simplest metazoans, or simplified forms of more complex animals. Only a single species (Trichoplax adhaerens) comprises the phylum Placozoa.
CHOANOCYTES
Flagellated cells that line the body cavity of a sponge and that are characterized by a collar of cytoplasm surrounding the flagellum. They are also called collar cells.
The origin of metazoans. Finally, we come to the origin of the metazoans. Based on recent EST and mitochondrial genome data (REF. 60), and the similarity to sponge CHOANOCYTES, the last unicellular common ancestors of the metazoans seem to have been the choanoflagellates (REF. 61). No ANTP-class gene has been reported in choanoflagellates, which indicates that animal multicellularity was linked to the origin of the ANTP-class genes, and that descendants of the early metazoans, which had single founder members of the three clusters, have not survived to the present. Again, an alternative hypothesis might be considered: that choanoflagellates were not the ancestors of metazoans, but degenerate sponges that lost their multicellularity and probably, also, ANTP-class genes (REF. 58).
Caveats. Although it is attractive, the thesis that links the origin of the three clusters to that of the three embryonic germ layers must be treated with caution for many reasons. First, germ-layer specificity is far from conclusive; these gene clusters have been evolving independently for hundreds of millions of years, and their present-day functions might well be a complex composite of ancestral and derived functions. Inferring the original function at the time of the birth of the clusters is therefore hazardous. Second, the theory relies on the idea that cnidarians were originally triploblastic. If they were not, then the recruitment of Hox, ParaHox and NK clusters to pattern the three germ layers must have occurred after the genesis of the three clusters, early in the bilaterian lineage, and not coincidentally with their origin. Third, if cnidarians are primitively triploblastic, we need to explain how the three clusters became associated with particular germ layers: did the association evolve after the cnidarian–bilaterian divergence, or was the association lost secondarily in cnidarians, as indicated by the endodermal expression of Hox genes (REF. 17) or the ectodermal expression of the ParaHox gene Gsh (REF. 62) in N. vectensis? Fourth, it has not been formally demonstrated that the original Hox, ParaHox and NK clusters were linked together in the genome at the time of their origin. This reconstruction is based on linkage and phylogenetic information from extant phyla, but an extant animal with an intact megacluster has yet to be found.
Homeobox clustering: functional constraints
Hox genes. Recent publications on a wide range of eukaryotes indicate that gene order is not random, and that genes with similar or coordinated expression patterns are often clustered (REFS 4,63). More than 25 years after the discovery of the Hox cluster and of collinearity (REF. 64), it is still unclear why Hox genes are clustered and whether clustering is needed for collinearity. ANTP-class homeobox clusters arose early in metazoan evolution. If there were no functional constraints on maintaining linkage, genes would, with time, have been scrambled over the genome. In recent years Duboule and colleagues have shown, using engineered mice, that temporal and not spatial collinearity is responsible for keeping the complex together. They have identified a global control region that lies 5′ of the HoxD cluster (REF. 65), and an early limb control element that lies 3′ of the same cluster (REF. 66); both act as global regulators of quantitative HoxD gene expression in limb development. Mechanistically, it has long been proposed that modulation of chromatin structure underlies the collinear regulation of Hox genes (REF. 3). Recently, it has been shown that, in cell culture as well as in mouse embryos, chromatin remodelling confers sequential transcriptional competence on the cluster (REFS 67,68).
It is becoming clear that there is a strong correlation between possessing intact Hox clusters, having a slow mode of development and showing temporal collinearity of gene expression. Conversely, animals that develop quickly lack temporal collinearity and have split or disintegrated Hox clusters (REFS 10,19,28). Outside vertebrates, temporal collinearity is seen in amphioxus, but no temporal collinearity is observed for Hox genes in other invertebrate deuterostomes, including the urochordates Ciona intestinalis (REF. 69) and Oikopleura dioica (REF. 28), and the echinoderm Strongylocentrotus purpuratus (REFS 70,71), all of which have broken or unusual Hox clusters. Temporal collinearity is also absent in protostomes that have broken or disintegrating Hox clusters, which include Drosophila lineages (REFS 72,73), the more basal insect Bombyx mori (REF. 74) and C. elegans (REF. 75). A rapid and determinative type of embryonic development might be too fast to allow a temporal progression in the activation of Hox genes: there is insufficient time for temporal collinearity to occur in these animals, whereas spatial collinearity still exists. If the constraint that maintains the close linkage in the Hox cluster is a need to achieve temporal collinearity, then once temporal collinearity is no longer needed, as the result of developmental changes that allow fast development, selection on the integrity of the cluster is relaxed. According to this model the remains of the cluster in any particular lineage are merely the result of phylogenetic inertia. With time these unconstrained clusters would tend to disintegrate.
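As a purely illustrative aside, the notion of temporal collinearity used above can be made quantitative. The sketch below is not a method from the studies cited here; it assumes that each gene is described by its position in the cluster and by the time at which its expression is first detected, and it scores collinearity as the Spearman rank correlation between the two. The function names and all numbers are hypothetical.

```python
from __future__ import annotations

from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: position in the cluster (1 = 3' end) and onset of
# expression in arbitrary time units. These numbers are invented.
cluster_position = [1, 2, 3, 4, 5]
onset_time = [10, 12, 15, 20, 26]
print(spearman(cluster_position, onset_time))
```

Under these assumptions a score close to 1 would correspond to strict temporal collinearity, whereas a score near 0 would indicate that activation order is unrelated to gene order, as proposed for lineages with disintegrated clusters.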
ParaHox genes. The ParaHox cluster is so far restricted to cephalochordates and vertebrates. It shows both spatial and temporal collinearity. However, temporal collinearity is inverted with respect to the Hox cluster, and the posterior Cdx genes are activated first. The ParaHox cluster has disintegrated in the C. intestinalis genome (REF. 19), and O. dioica has lost the central ParaHox gene (REF. 20). In protostomes, D. melanogaster and C. elegans have lost some of the ParaHox genes (D. melanogaster lacks Xlox, and C. elegans lacks Gsh and Xlox). This gene loss must be secondary, as all three genes are present in other protostomes (REF. 18). Again, the sequential deployment of these genes could be the constraint for maintaining a ParaHox cluster in animals that have slow development that is not based on strict cell-lineage determination.
NK genes. So, the maintenance of the NK cluster in D. melanogaster but not in chordates might be perplexing: the cluster is largely conserved in D. melanogaster, whereas it has broken up into three pieces in vertebrates. The selective constraints that function on the D. melanogaster cluster are probably related to the sequential functional deployment of these genes during mesodermal development, whereas this selective pressure was relaxed during the evolution of deuterostome or chordate development. Again it seems that time (the coordinated temporal expression of neighbouring NK genes) might be the driving force that has maintained the close gene linkage in D. melanogaster.
Attributing the maintenance of ANTP-cluster organization to sequential time deployment is attractive. This explanation has features in common with the developmental switches that have been described for the β-globin cluster (REF. 76), and with the recently discovered mouse Rhox homeobox cluster, a cluster of 12 homeobox genes that are expressed during the development of the mouse germ cells, and for which the timing of activation depends on their position in the cluster (REF. 77).
Summary and outlook
The data presented in this article allow us to propose a model to explain the origin of Hox genes. Early in metazoan evolution, the founder member of the ANTP class of homeobox genes underwent a series of cis-duplications, generating an extensive array of homeobox genes that included the extended Hox, ParaHox and NK clusters. The clusters followed distinct evolutionary pathways in different lineages, with splits, losses and dispersions around the genome, ending with compact Hox and ParaHox clusters in vertebrates and compact NK clusters in insects. The early steps for the genesis of the megacluster are uncertain, but most probably occurred early in metazoan evolution, certainly before the cnidarian–bilaterian split. It is also tempting to imagine that the origin, diversification and increasing complexity of the three germ layers coincided with the generation, and expansion, of the three ANTP clusters. The apparent paradox that the clusters appeared before the ‘radial’ cnidarians might be explained if, as new molecular data indicate, cnidarians were primitively bilateral and triploblastic.
The driving force that maintains the close linkage is probably temporal collinearity or the sequential functional deployment of genes. When the temporal collinearity of Hox and ParaHox genes is lost owing to a rapid mode of embryonic development, the selective pressure that maintains linkage is no longer active and phylogenetic inertia in each lineage tends to split up the cluster.
Curiously, the main invertebrate model systems such as D. melanogaster, C. elegans or C. intestinalis have been chosen owing to their short life cycle, small size, rapid development and ease of culture. In the light of the arguments outlined here, it is not surprising that gene databanks record more broken Hox and ParaHox clusters than intact ones. Evolutionary and developmental biologists are searching for an extant animal that has an intact megacluster, and another animal (or perhaps a protozoan) with a single ANTP-class gene, which would allow them to demonstrate the phylogenetic reconstruction that is suggested here. Sponges, cnidarians and basal bilaterians are good candidates. Nevertheless, if cluster disintegration and fast and derived development are correlated, those models should be chosen on the basis of their evolutionary and developmental properties, rather than on their ease of handling.
Approaching the enigma of why temporally coordinated expression needs clustering is a great challenge. The answer will probably be found by researching chromatin structure and remodelling, histone modification, DNA domain architecture, and intranuclear localization of given regions of DNA. Even so, the selective pressure to maintain gene clustering could be too subtle to be detected by experimental methods, and moving a gene out of its position in the cluster often does not markedly change its pattern of expression. Small differences in fitness might be detected only by population genetics, for example, by studying linkage disequilibrium in the cluster. In the words of the geneticist who pioneered the study of homeotic genes, Ed Lewis, “We know the fly looks OK with the gene moved, but we do not know if she goes to the theatre.”
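To illustrate what studying linkage disequilibrium involves in its simplest form, the following sketch computes the standard two-site statistics D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)) from a sample of haplotypes. It is a minimal example under stated assumptions, not an analysis from this article: the function name, the 0/1 allele coding and the sample data are all hypothetical.

```python
from __future__ import annotations

from typing import Iterable, Tuple


def linkage_disequilibrium(
    haplotypes: Iterable[Tuple[int, int]],
) -> tuple[float, float]:
    """Return (D, r_squared) for two biallelic sites.

    Each haplotype is (allele_at_site_A, allele_at_site_B), coded 0 or 1.
    """
    haps = list(haplotypes)
    n = len(haps)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haps) / n    # frequency of allele 1 at site A
    p_b = sum(b for _, b in haps) / n    # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                 # departure from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom


# Hypothetical sample: two sites, e.g. in neighbouring genes of a cluster.
sample = [(1, 1), (1, 1), (0, 0), (0, 0), (1, 0), (0, 1), (1, 1), (0, 0)]
d, r_squared = linkage_disequilibrium(sample)
print(d, r_squared)
```

An r² of 1 means that the alleles at the two sites are always inherited together in the sample, and an r² of 0 means that they are statistically independent.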
1. McGinnis, W. & Krumlauf, R. Homeobox genes and axial
patterning. Cell 68, 283–302 (1992).
2. Duboule, D. Temporal collinearity and the phylotypic
progression: a basis for the stability of a vertebrate
Bauplan and the evolution of morphologies through
heterochrony. Development Suppl. 135–142
(1994).
3. Kmita, M. & Duboule, D. Organizing axes in time and
space: 25 years of collinear tinkering. Science 301,
331–333 (2003).
4. Hurst, L. D., Pál, C. & Lercher, M. J. The evolutionary
dynamics of eukaryotic gene order. Nature Rev. Genet. 5,
299–310 (2004).
5. Brooke, N. M., Garcia-Fernàndez, J. & Holland, P. W. H.
The ParaHox gene cluster is an evolutionary sister of the
Hox gene cluster. Nature 392, 920–922 (1998).
6. Holland, L. Z. Non-neural ectoderm is really neural: evolution
of developmental patterning mechanisms in the non-neural
ectoderm of chordates and the problem of sensory cell
homologies. J. Exp. Zool. B 304B, 1–20 (2005).
7. Jagla, K., Bellard, M. & Frasch, M. A cluster of Drosophila
homeobox genes involved in mesoderm differentiation
programs. Bioessays 23, 125–133 (2001).
This paper presents a complete description of linkage
information and expression data for the NK cluster
genes in D. melanogaster, the organism in which
temporal collinearity of the NK cluster genes in the
mesoderm was first noticed.
8. Martínez, P. & Amemiya, C. T. Genomics of the HOX gene
cluster. Comp. Biochem. Physiol. B 133, 571–580 (2002).
9. Prince, E. The Hox paradox: more complex(s) than
imagined. Dev. Biol. 249, 1–15 (2002).
10. Ferrier, D. E. K. & Minguillón, C. Evolution of the Hox/
ParaHox gene clusters. Int. J. Dev. Biol. 47, 605–611 (2003).
11. de Rosa, R. et al. Hox genes in brachiopods and priapulids
and protostome evolution. Nature 399, 772–776 (1999).
12. Powers, T. P. & Amemiya, C. T. Evidence for a Hox14
paralog group in vertebrates. Curr. Biol. 14, R183–R184
(2004).
13. Minguillón, C. et al. No more than 14: the end of the
amphioxus Hox cluster. Int. J. Biol. Sci. 1, 19–23 (2005).
14. Hooman, K., Moghadam, H. K., Ferguson, M. M. &
Danzmann, R. G. Organization of Hox clusters in rainbow
trout (Oncorhynchus mykiss): a tetraploid model species.
J. Mol. Evol. 30 Sep 2005
(doi:10.1007/s00239-004-0338-7).
15. Cook, C. E., Jiménez, E., Akam, M. & Saló, E. The Hox
gene complement of acoel flatworms, a basal bilaterian
clade. Evol. Dev. 6, 154–163 (2004).
16. Baguñà, J. & Riutort, M. The dawn of bilaterian animals. The
case of acoelomorph flatworms. Bioessays 26, 1046–1057
(2004).
17. Finnerty, J. R., Pang, K., Burton, P., Paulson, D. &
Martindale, M. Q. Origins of bilateral symmetry: Hox and
dpp expression in a sea anemone. Science 304,
1335–1337 (2004).
18. Ferrier, D. E. K. & Holland, P. W. H. Sipunculan ParaHox
genes. Evol. Dev. 3, 263–270 (2001).
19. Ferrier, D. E. K. & Holland, P. W. H. Ciona intestinalis
ParaHox genes: evolution of Hox/ParaHox cluster integrity,
developmental mode, and temporal colinearity. Mol.
Phylogenet. Evol. 24, 412–417 (2001).
20. Edvardsen, R. B. et al. Remodelling of the homeobox gene
complement in the tunicate Oikopleura dioica. Curr. Biol. 15,
R12–R13 (2005).
21. Finnerty, J. R. & Martindale, M. Q. Ancient origins of axial
patterning genes: Hox genes and ParaHox genes in the
Cnidaria. Evol. Dev. 1, 16–23 (1999).
22. Ferrier, D. E. K. & Holland, P. W. H. Ancient origin of
the Hox gene cluster. Nature Rev. Genet. 2, 33–38 (2001).
23. Dush, M. K. & Martin, G. R. Analysis of mouse Evx genes:
Evx-1 displays graded expression in the primitive streak.
Dev. Biol. 151, 273–287 (1992).
24. Miller, D. J. & Miles, A. Homeobox genes and the zootype.
Nature 365, 215–216 (1993).
25. Banerjee-Basu, S. & Baxevanis, A. D. Molecular evolution of
the homeodomain family of transcription factors. Nucleic
Acids Res. 29, 3258–3269 (2001).
26. Minguillón, C. & Garcia-Fernàndez, J. Genesis and evolution
of the Evx and Mox genes and the extended Hox and
ParaHox gene clusters. Genome Biol. 4, R12 (2003).
27. Garcia-Fernàndez, J. Hox, ParaHox, ProtoHox: facts and
guesses. Heredity 94, 145–152 (2005).
28. Seo, H. C. et al. Hox cluster disintegration with
persistent anteroposterior order of expression in Oikopleura
dioica. Nature 431, 67–71 (2004).
This paper describes the Hox complement of the
Urochordate Oikopleura dioica, an animal that has a
compact genome, and shows that the Hox cluster is
dispersed across the genome. Although no Hox gene
is linked to any other (in that they are at least 250 kb
apart), there is still evidence for some residual spatial
collinearity of expression.
29. Matsui, T., Hirai, M., Hirano, M. & Kurosawa, Y. The HOX
complex neighbored by EVX gene, as well as two other
homeobox-containing genes, the GBX-class and the
EN-class, are located on the same chromosomes 2 and 7.
FEBS Lett. 336, 107–110 (1993).
30. Pollard, S. & Holland, P. W. H. Evidence for 14 homeobox
gene clusters in human genome ancestry. Curr. Biol. 10,
1059–1062 (2000).
31. Castro, L. F. & Holland, P. W. H. Chromosomal mapping of
ANTP class homeobox genes in amphioxus: piecing
together ancestral genomes. Evol. Dev. 5, 459–465 (2003).
The authors analyse the linkage of Hox, ParaHox and
NK genes in the cephalochordate amphioxus, and
confirm that the pre-duplicative amphioxus genome
contains the three main blocks of genes, which are
similar in organization to the ancestral chordate
condition that has been deduced from the human
genome.
32. Kim, Y. & Nirenberg, M. Drosophila NK-homeobox genes.
Proc. Natl Acad. Sci. USA 86, 7716–7720 (1989).
33. Jagla, K., et al. ladybird, a tandem of homeobox genes that
maintain late wingless expression in terminal and dorsal
epidermis of the Drosophila embryo. Development 124,
91–100 (1997).
34. Luke, G. N. et al. Dispersal of NK homeobox gene clusters
in amphioxus and humans. Proc. Natl Acad. Sci. USA 100,
5292–5295 (2003).
This study shows that the composition of the NK
gene cluster in the pre-duplicative amphioxus
genome is broken at the same positions as in the
vertebrate clusters. The authors conclude that
the constraints to maintain linkage in chordates have
been relaxed, whereas a strong functional constraint
must be present in D. melanogaster.
35. Gauchat, D. et al. Evolution of Antp-class genes and
differential expression of Hydra Hox/ParaHox genes in
anterior patterning. Proc. Natl Acad. Sci. USA 97,
4493–4498 (2000).
36. Manuel, M. & LeParco, Y. Homeobox gene diversification in
the calcareous sponge Sycon raphanus. Mol. Phylogenet. Evol.
17, 97–107 (2000).
37. Coutinho, C., Fonseca, R. N., Mansure, J. J. & Borojevic, R.
Early steps in the evolution of multicellularity: deep structural
and functional homologies among homeobox genes in
sponges and higher metazoans. Mech. Dev. 120, 429–440
(2003).
38. Hill, A., Tetrault, J. & Hill, M. Isolation and expression
analysis of a poriferan Antp-class Bar-/Bsh-like homeobox
gene. Dev. Genes Evol. 214, 515–523 (2004).
This article includes a recent update on, and phylogenetic
analyses of, all known ANTP-class homeobox genes in
sponges.
39. Lowe, C. J. et al. Anteroposterior patterning in
hemichordates and the origins of the chordate nervous
system. Cell 113, 853–865 (2003).
40. Freund, J. N., Domon-Dell, C., Kedinger, M. & Duluc, I. The
Cdx-1 and Cdx-2 homeobox genes in the intestine.
Biochem. Cell Biol. 76, 957–969 (1998).
41. Moreno, E. & Morata, G. Caudal is the Hox gene that
specifies the most posterior Drosophila segment. Nature
26, 873–877 (1999).
42. Offield, M. F. et al. PDX-1 is required for pancreatic
outgrowth and differentiation of the rostral duodenum.
Development 122, 983–995 (1996).
43. Wysocka-Diller, J., Aisemberg, G. O. & Macagno, E. R.
A novel homeobox cluster expressed in repeated structures
of the midgut. Dev. Biol. 171, 439–447 (1995).
44. Weiss, J. B. et al. Dorsoventral patterning in the Drosophila
central nervous system: the intermediate neuroblasts
defective homeobox gene specifies intermediate column
identity. Genes Dev. 12, 3591–3602 (1998).
45. Holland, P. W. H. Beyond the Hox: how widespread
is homeobox gene clustering? J. Anat. 199, 13–23 (2001).
46. Rosanas-Urgell, A., Marfany, G. & Garcia-Fernàndez, J.
Pdx1-related homeodomain transcription factors are
distinctly expressed in adult pancreatic islets. Mol. Cell.
Endocrinol. 237, 59–66 (2005).
47. Grapin-Botton, A. & Melton, D. A. Endoderm development:
from patterning to organogenesis. Trends Genet. 16,
124–130 (2000).
48. Schafer, K., Neuhaus, P., Kruse, J. & Braun, T. The
homeobox gene Lbx1 specifies a subpopulation of cardiac
neural crest necessary for normal heart development. Circ.
Res. 92, 73–80 (2003).
49. Reecy, J. M. et al. Chicken Nkx-2.8: a novel homeobox
gene expressed in early heart progenitor cells and
pharyngeal pouch-2 and -3 endoderm. Dev. Biol. 188,
295–311 (1997).
50. Pera, E. M. & Kessel, M. Demarcation of ventral territories
by the homeobox gene NKX2.1 during early chick
development. Dev. Genes Evol. 208, 168–171 (1998).
51. Martindale, M. Q., Pang, K. & Finnerty, J. R. Investigating
the origins of triploblasty: ‘mesodermal’ expression in a
diploblastic animal, the sea anemone Nematostella
vectensis (phylum, Cnidaria; class, Anthozoa). Development
131, 2463–2474 (2004).
Based on expression data from anthozoan embryos,
the authors argue in favour of a primitively
diploblastic origin of cnidarians, and discuss the
possibility that cnidarians are simplified triploblasts.
52. Seipel, K. & Schmid, V. Evolution of striated muscle:
jellyfish and the origin of triploblasty. Dev. Biol. 282, 14–26
(2005).
Based on data from jellyfish, the authors argue in
favour of a triploblastic origin of cnidarians.
53. Stephenson, T. A. The British Sea Anemones Vol. 1 (Ray
Society, London, 1928).
54. Hyman, L. H. The Invertebrates: Protozoa Through
Ctenophora (McGraw-Hill, New York, 1940).
55. Hayward, D. C. et al. Localized expression of a dpp/
BMP2/4 ortholog in a coral embryo. Proc. Natl Acad. Sci.
USA 99, 8106–8111 (2002).
56. Finnerty, J. R. The origins of axial patterning in the metazoa:
how old is bilateral symmetry? Int. J. Dev. Biol. 47, 523–529
(2003).
57. Muller, W. E. et al. Bauplan of urmetazoa: basis for
genetic complexity of metazoa. Int. Rev. Cytol. 235,
53–92 (2004).
58. Maldonado, M. Choanoflagellates, choanocytes, and animal
multicellularity. Inv. Biol. 123, 1–22 (2004).
This data-rich paper summarizes the embryology of
sponges. The author provocatively suggests that
higher metazoans are derived from the complex larval
stages of sponges, and that some sort of bilaterality
is already present in sponge embryogenesis.
59. Jakob, W. et al. The Trox-2 Hox/ParaHox gene of Trichoplax
(Placozoa) marks an epithelial boundary. Dev. Genes Evol.
214, 170–175 (2004).
60. King, N., Hittinger, C. T. & Carroll, S. B. Evolution of key cell
signaling and adhesion protein families predates animal
origins. Science 301, 361–363 (2003).
61. Brooke, M. N. & Holland, P. W. H. The evolution of
multicellularity and early animal genomes. Curr. Opin. Genet.
Dev. 13, 599–603 (2003).
62. Finnerty, J. R., Paulson, D., Burton, P., Pang, K. &
Martindale, M. Q. Early evolution of a homeobox gene: the
parahox gene Gsx in the Cnidaria and the Bilateria. Evol.
Dev. 5, 331–345 (2003).
63. Oliver, B. & Misteli, T. A non-random walk through the
genome. Genome Biol. 6, 214 (2005).
64. Lewis, E. B. A gene complex controlling segmentation in
Drosophila. Nature 276, 565–570 (1978).
65. Spitz, F., Gonzalez, F. & Duboule, D. A global control region
defines a chromosomal regulatory landscape containing the
HoxD cluster. Cell 113, 405–417 (2003).
66. Zákany, J., Kmita, M. & Duboule, D. A dual role for Hox
genes in limb anterior-posterior asymmetry. Science 304,
1669–1672 (2004).
Elegantly engineered mice led Duboule’s team to
identify a further global regulator of the Hox cluster
that is involved in the embryonic development of the
limb, and to propose that there is more than one class
of collinearity.
67. Chambeyron, S. & Bickmore, W. A. Chromatin
decondensation and nuclear reorganization of the HoxB
locus upon induction of transcription. Genes Dev. 18,
1119–1130 (2004).
68. Chambeyron, S., Da Silva, N. R., Lawson, K. A. &
Bickmore, W. A. Nuclear re-organisation of the HoxB
complex during mouse embryonic development.
Development 132, 2215–2223 (2005).
69. Ikuta, T., Yoshida, N., Satoh, N. & Saiga, H. Ciona
intestinalis Hox gene cluster: its dispersed structure and
residual colinear expression in development. Proc. Natl
Acad. Sci. USA 101, 15118–15123 (2004).
70. Arenas-Mena, C., Cameron, R. A. & Davidson, E. H.
Spatial expression of the Hox complex in the indirect
development of a sea urchin. Proc. Natl Acad. Sci. USA 95,
13062–13067 (1998).
71. Cameron, R. A. et al. Unusual gene order and organization
of the sea urchin Hox cluster. J. Exp. Zool. Part B 22 Aug
2005 (doi:10.1002/jez.b.21070).
72. Lewis, E. B., Pfeiffer, B. D., Mathog, D. R. & Celniker, S. E.
Evolution of the homeobox complex in the Diptera. Curr.
Biol. 13, R587–R588 (2003).
73. Negre, B. et al. Conservation of regulatory sequences and
gene expression patterns in the disintegrating Drosophila
Hox gene complex. Genome Res. 15, 692–700 (2005).
The breakage point in distinct Drosophila species
indicates that the Hox cluster in insects is
disintegrating by phylogenetic inertia, as selective
pressure for temporal collinearity is absent. The
comparison of full Hox cluster sequences indicates
that the breakages do not disrupt individual
enhancers, which are generally located at the 5′ end
of each Hox gene.
74. Yasukochi, Y. et al. Organization of the Hox gene cluster of
the silkworm, Bombyx mori: a split of the Hox cluster in a
non-Drosophila insect. Dev. Genes Evol. 214, 606–614
(2004).
75. Aboobaker, A. A. & Blaxter, M. L. Hox gene loss during
dynamic evolution of the nematode cluster. Curr. Biol. 13,
37–40 (2003).
76. Spitz, F. & Duboule, D. Reproduction in clusters. Nature
434, 715–716 (2005).
77. MacLean, J. A. et al. Rhox: a new homeobox gene cluster.
Cell 120, 369–382 (2005).
78. Gilbert, S. F. Opening Darwin’s black box: teaching evolution
through developmental genetics. Nature Rev. Genet. 4,
735–741 (2003).
79. Holland, N. D. Early central nervous system evolution: an era
of skin brains? Nature Rev. Neurosci. 4, 617–627 (2003).
80. Luke, G. N. The Amphioxus NK Cluster. Thesis, Univ.
Reading, UK (2004).
Acknowledgements
I am very grateful to P. Holland, J. Baguñà, C. Minguillón and the
members of the Barcelona Evo-Devo group for passionate
discussions. I especially thank G. Luke for allowing me to use and cite his
Ph.D. Thesis, and R. Rycroft for checking the English. The author’s
research is funded by the Ministerio de Educación y Ciencia, Spain,
by the Departament d’Universitats, Recerca i Societat de la
Informació de la Generalitat de Catalunya (Distinció per la Promoció
de la Recerca Universitaria), and by the European Community’s
‘Neurogenome’ Human Potential Programme.
Competing interests statement
The author declares no competing financial interests.
Online links
DATABASES
The following terms in this article are linked online to:
Entrez Gene:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
ANTP | bap | Cdx | En | Evx | Gbx | Gsh | Ipf1 | Meox | slou |
tin | UBX | vnd
FURTHER INFORMATION
Expert Protein Analysis System (ExPASy): http://kr.expasy.org
FlyBase: http://flybase.bio.indiana.edu
GenBank: www.ncbi.nlm.nih.gov/Genbank
Human Genome Organization: www.hugo-international.org
Jordi Garcia’s web page: www.ub.edu/genetica/grup4/4garcia.htm