The genome of the sea urchin Strongylocentrotus purpuratus.
ABSTRACT We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Sea Urchin Genome Sequencing Consortium et al., The Genome of the Sea Urchin Strongylocentrotus purpuratus. Science 314, 941 (2006).
Sea Urchin Genome Sequencing Consortium*†
The genome of the sea urchin was sequenced primarily because of the remarkable usefulness of the echinoderm embryo as a research model system for modern molecular, evolutionary, and cell biology. The sea urchin is the first animal with a sequenced genome that (i) is a free-living, motile marine invertebrate; (ii) has a bilaterally organized embryo but a radial adult body plan; (iii) has the endoskeleton and water vascular system found only in echinoderms; and (iv) has a nonadaptive immune system that is unique in the enormous complexity of its receptor repertoire. Sea urchins are remarkably long-lived, with life spans of strongylocentrotid species extending to over a century [see supporting online material (SOM)], and highly fecund, producing millions of gametes each year. Strongylocentrotus purpuratus is a pivotal component of subtidal marine ecology and an important fishery catch in several areas of the world, including the United States. Although the sea urchin has been a research model in developmental biology for a century and a half, for most of that time few were aware of one of the most important characteristics [...] significance for genomic analysis: Echinoderms (and their sister phylum, the hemichordates) are the closest known relatives of the chordates (Fig. 1 and SOM). A description of the echinoderm body plan, as well as aspects of the life-style, longevity, polymorphic gene pool, and characteristics that make the sea urchin so valuable as a research organism, is presented in the SOM.
The last common ancestors of the deuterostomal groups at the branch points shown in Fig. [...] ago (Ma)], according to protein molecular phylogeny. Stem group echinoderms appear in the Lower Cambrian fossil assemblages dating to 520 [...] forms, but from their first appearance, the fossil record illustrates certain distinctive features that are still present: their water vascular system, including rows of tube feet protruding through holes in the ambulacral grooves, and their calcite endoskeleton (mainly, a certain form of CaCO3),
*Correspondence should be addressed to George M.
Weinstock. E-mail: email@example.com
†All authors with their contributions and affiliations appear
at the end of this paper.
VOL 314 10 NOVEMBER 2006
CORRECTED 9 FEBRUARY 2007; SEE LAST PAGE
which displays the specific three-dimensional structure known as “stereom.” The species sequenced, Strongylocentrotus purpuratus, commonly known as the “California purple sea urchin,” is a representative of the thin-spined “modern” group of regularly developing sea urchins (euechinoids). These evolved to become the dominant echinoid form after the great [...]
We present here a description of the S. purpuratus genome and gene products. The genome provides a wealth of discoveries about the biology of the sea urchin, Echinodermata, and the deuterostomes. Among the key findings are the following:
• The sea urchin is estimated to have 23,300 genes with representatives of nearly all vertebrate gene families, although often the families are not as large as in vertebrates.
• Some genes thought to be vertebrate-specific were found in the sea urchin (deuterostome-specific); others were identified in sea urchin but not the chordate lineage, which suggests loss in [...]
• Expansion of some gene families occurred apparently independently in the sea urchin and vertebrates.
• The sea urchin has a diverse and sophisticated immune system mediated by an astonishingly large repertoire of innate pathogen recognition proteins.
• An extensive defensome was identified.
• The sea urchin has orthologs of genes associated with vision, hearing, balance, and chemosensation in vertebrates, which suggests hitherto unknown sensory capabilities.
• Distinct genes for biomineralization exist in the sea urchin and vertebrates.
• [...] genes were found in the sea urchin.
Sequencing and Annotation of the S. purpuratus Genome
Sequencing and assembly. Sperm from a single male was used to prepare DNA for all libraries (tables S1 and S2) and whole-genome shotgun (WGS) sequencing. The overall approach was based on the “combined strategy” used for the rat genome (1), where WGS sequencing to six times coverage was combined with two times sequence coverage of BAC clones from a minimal tiling path (MTP) (fig. S1). The use of BACs provided a framework for localizing the assembly process, which aided in the assembly of repeated sequences and solved problems associated with the high heterozygosity of the sea urchin genome, without our resorting to extremely high coverage sequencing.
Several different assemblies were produced during the course of the project (see SOM for details). The Sea Urchin Genome Project (SUGP) was the first to produce both intermediate WGS assemblies and a final combined assembly. This was especially useful, not only for the early availability of an assembly for analysis, but also because WGS contigs were used to fill gaps between [...] WGS assembly was produced (v 0.5; GenBank accession number range AAGJ01000001 to AAGJ01320773; also referred to as NCBI build 1.1) and released in April 2005. The final combined BAC-WGS assembly was released in July 2006 as version (v) 2.1 and submitted to GenBank (accession number range AAGJ02000001 to AAGJ02220581).
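The two accession ranges quoted above can be turned into record counts with a few lines of arithmetic. The sketch below assumes the usual WGS accession layout (a four-letter project prefix, a two-digit version, then a serial number) and assumes that each accession in a range corresponds to one deposited sequence record; the function name is illustrative only.

```python
import re

# Assumed layout: 4-letter project prefix + 2-digit version + serial number.
WGS_ACCESSION = re.compile(r"^([A-Z]{4}\d{2})(\d+)$")

def accession_count(first: str, last: str) -> int:
    """Count accessions in an inclusive range that shares one prefix."""
    m_first = WGS_ACCESSION.match(first)
    m_last = WGS_ACCESSION.match(last)
    if not m_first or not m_last:
        raise ValueError("not a recognizable WGS accession")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions belong to different projects/versions")
    return int(m_last.group(2)) - int(m_first.group(2)) + 1

# Ranges as given in the text for the v 0.5 and v 2.1 assemblies.
wgs_v05_records = accession_count("AAGJ01000001", "AAGJ01320773")
combined_v21_records = accession_count("AAGJ02000001", "AAGJ02220581")
print(wgs_v05_records, combined_v21_records)
```

Under these assumptions the combined assembly is deposited as fewer records than the intermediate WGS assembly, which is what one would expect if BAC data merged many WGS contigs.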
A second innovation in the SUGP was the use of the clone-array pooled shotgun sequencing (CAPSS) strategy (2) for BAC sequencing (fig. [...] than prepare separate random libraries from each of these, the CAPSS strategy involved BAC shotgun sequencing from pools of clones and then deconvoluting the reads to the individual BACs. This allowed the BAC sequencing to be performed in one-fifth the time and at one-tenth the cost.
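The text does not spell out how reads from pooled clones are deconvoluted, so the following is only a minimal sketch of one way such a scheme can work: clones are laid out on a grid, each row and each column is sequenced as a pool, and a contig built from reads of exactly one row pool and one column pool is assigned to the clone at their intersection. The grid layout, pool names, and BAC names are hypothetical and are not taken from the SUGP.

```python
def deconvolute(contig_pools, grid):
    """Assign each contig to a clone, or None if the assignment is ambiguous.

    contig_pools: dict mapping contig name -> set of pool names whose reads
                  were assembled into that contig.
    grid:         dict mapping (row_pool, column_pool) -> clone name.
    """
    assignments = {}
    for contig, pools in contig_pools.items():
        rows = sorted(p for p in pools if p.startswith("row"))
        cols = sorted(p for p in pools if p.startswith("col"))
        if len(rows) == 1 and len(cols) == 1:
            # Exactly one row and one column: unique intersection.
            assignments[contig] = grid.get((rows[0], cols[0]))
        else:
            # Missing or multiple pools: cannot place the contig.
            assignments[contig] = None
    return assignments

# Hypothetical 2 x 2 array of BAC clones.
grid = {
    ("row1", "col1"): "BAC_A", ("row1", "col2"): "BAC_B",
    ("row2", "col1"): "BAC_C", ("row2", "col2"): "BAC_D",
}
contig_pools = {
    "contig1": {"row1", "col2"},
    "contig2": {"row2", "col1"},
    "contig3": {"row1", "row2", "col1"},  # ambiguous: two row pools
}
assigned = deconvolute(contig_pools, grid)
```

The appeal of a layout like this is that an n × n array needs only 2n pooled libraries rather than n² individual ones, which is consistent in spirit with the time and cost savings reported above.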
The principal new challenge in the SUGP was the high heterozygosity in the outbred animal that was sequenced. It was known that single-copy DNA in the sea urchin varied by as much as 4 to 5% [single nucleotide polymorphism (SNP) plus insertion/deletion (indel)], which is much greater than in human (∼0.5%) (3). Moreover, alignment of WGS reads to the early v 0.1 WGS assembly [...] comparable frequency of indel variants. This average frequency of a mismatch per 50 bases or higher prevented merging by the assembly module in Atlas, the Phrap assembler, and also made it difficult to determine if reads were from duplicated but diverged sections of the genome or heterozygous homologs. This challenge was met by adding components to Atlas to handle local regions of heterozygosity and to take advantage of the BAC data, because each BAC sequence represented a single haplotype (see SOM). High heterozygosity has been seen in the past with the Ciona genomes (4, 5) and is likely to be the norm in the future as fewer inbred organisms are sequenced. Moreover, the CAPSS approach makes BAC sequencing more manageable for large genomes. Thus, the sea urchin project may serve as a paradigm for future difficult endeavors.
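To see why a mismatch every 50 bases is troublesome for an overlap-based assembler, a back-of-the-envelope calculation helps. The sketch below assumes mismatches occur independently at a uniform rate, which is a simplification, and the 100-base overlap length is a hypothetical value chosen only for illustration.

```python
def expected_mismatches(overlap_len: int, mismatch_rate: float = 1 / 50) -> float:
    """Expected number of mismatches in an overlap between two haplotypes."""
    return overlap_len * mismatch_rate

def prob_perfect_overlap(overlap_len: int, mismatch_rate: float = 1 / 50) -> float:
    """Probability that an overlap contains no mismatch at all,
    assuming independent, uniformly distributed mismatches."""
    return (1 - mismatch_rate) ** overlap_len

# Hypothetical 100-base overlap between reads from the two haplotypes.
mean_mm = expected_mismatches(100)
p_clean = prob_perfect_overlap(100)
print(f"expected mismatches: {mean_mm:.1f}, P(no mismatch): {p_clean:.3f}")
```

Under these assumptions most overlaps between reads from different haplotypes carry at least one mismatch, so an assembler tuned for near-identical overlaps will tend to keep the haplotypes apart.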
[...] WGS sequence generated a high-quality draft with [...] the genome while sequencing to a level of 8× base [...] in good agreement with the previous estimate of genome size, 800 Mb ± 5% (6). The assembly is a mosaic of the two haplotypes, but it was possible to [...] many mismatches neighboring BACs had in their overlap regions. This information will be used to create a future version of the genome in which the individual haplotypes are resolved.
Gene predictions. The v 0.5 WGS assembly displayed sufficient sequence continuity (a contig N50 of 9.1 kb) and higher-order organization (a [...] even while the BAC component was being sequenced. We generated an official gene set (OGS), consisting of ~28,900 gene models, by merging four different sets of gene predictions with the GLEAN program (7) (see SOM for [...] both v 0.5 and v 2.0 assemblies.
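The contig N50 quoted above is a standard continuity statistic: the length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal sketch of the computation follows; the example lengths are made up and have nothing to do with the sea urchin assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical contig lengths in kb.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Because a few long contigs can dominate the sum, N50 is usually read alongside the total assembled length rather than on its own.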
To estimate the number of genes in the S. purpuratus genome, we began with the 28,900 [...] redundancy found by mapping to the v 2.0 assembly, then increased it by a few percent for the new genes observed in the Ensembl set from the v 2.0 assembly compared with v 0.5. From manual analysis of well-characterized gene sets (e.g., ciliary, cell cycle control, and RNA metabolism [...] another 25% of the genes in the OGS were fragments, pseudogenes, or otherwise not valid. Finally, whole-genome tiling microarray analysis (see below) showed 10% of the transcriptionally active regions (long open reading frames, not small RNAs) were not represented by genes in the OGS. Taken together, this analysis gave an estimate of about 23,300 genes for S. purpuratus. Information on all annotated genes can be found at (8).
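The chain of corrections in this paragraph can be written as a short calculation. The sketch below is one reading of the text, not the consortium's actual procedure: it treats each correction as a multiplicative factor and interprets the 10% of unrepresented transcribed regions as meaning the valid models cover 90% of the true gene set. The redundancy and new-gene fractions are not recoverable from this text, so the values passed in the example are hypothetical placeholders.

```python
def estimate_gene_count(models, redundant_frac, new_frac, invalid_frac, missed_frac):
    """Apply successive corrections to a raw gene-model count.

    redundant_frac: fraction of models removed as redundant
    new_frac:       fractional increase for newly observed genes
    invalid_frac:   fraction judged fragments, pseudogenes, or invalid
    missed_frac:    fraction of real genes not represented by any model
    """
    nonredundant = models * (1 - redundant_frac)
    with_new = nonredundant * (1 + new_frac)
    valid = with_new * (1 - invalid_frac)
    return valid / (1 - missed_frac)

# 28,900 models, 25% invalid, and 10% missed come from the text;
# 0.05 and 0.02 are hypothetical stand-ins for the figures lost here.
estimate = estimate_gene_count(28_900, redundant_frac=0.05, new_frac=0.02,
                               invalid_frac=0.25, missed_frac=0.10)
print(round(estimate))
```

With these placeholder values the result lands close to the reported figure of about 23,300, which suggests the multiplicative reading is at least plausible, but it should not be taken as a reconstruction of the authors' numbers.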
The overall trends in gene structure were similar to those seen in the human genome. The [...] length was 7.7 kb with an average primary transcript length of 8.9 kb. A broad distribution of all exon lengths peaked at around 100 to 115 nucleotides, whereas that for introns peaked at around 750 nucleotides. The smaller average intron size relative to that of humans was consistent with the trend that intron size is correlated with genome size.
Annotation process. Manual annotation and analysis of the OGS was performed by a group of over 200 international volunteers, primarily from [...] To centralize the annotation efforts, an annotation database and a shared Web browser, Genboree (9), were established at the BCM-HGSC. These tools enabled integrated and collaborative analysis of both precomputed and experimental information (see SOM). A variety of precomputed information for each predicted gene model was made available to the annotators in the browser, including expressed sequence tag (EST) data, the four unmerged gene prediction sets, and transcription data from whole-genome tiling microarray with embryonic RNA (see below) (10). Additional resources available to the community are listed in table S4.
[...] by the consortium with 159 novel models (gene models not represented in the OGS) added to the [...] models, the number of novel models added may imply that the official set contains >98% of the [...]
Genome features. A window on the genetic
landscape is scaffold-centric in S. purpuratus,
because linkage and cytogenetic maps are not
available. The 36.9% GC content of the genome is
uniformly low because assessment of the average
GC content by domains is consistent (36.8%), and
the distribution is tight (see SOM). Genes from the
OGS show no tendency to occupy regions of
higher- or lower-than-average GC content. In fact,
nearly all genes lie in regions of 35 to 39% GC.
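A domain-by-domain GC assessment of the kind described here can be approximated by computing GC content in fixed, non-overlapping windows along each scaffold. The sketch below is a generic illustration; the window size and the toy sequence are hypothetical, and the actual domain definition used in the SOM may differ.

```python
def gc_fraction(seq):
    """GC fraction over unambiguous bases; None if the window has none."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    called = sum(counts.values())
    if called == 0:
        return None  # e.g. a window consisting entirely of N
    return (counts["G"] + counts["C"]) / called

def windowed_gc(seq, window):
    """GC fraction for consecutive non-overlapping windows of a sequence."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]

# Toy scaffold with one GC-rich and one AT-rich window.
toy_windows = windowed_gc("GGCCAATT", 4)
```

A tight distribution of the per-window values around the genome-wide mean is what the text describes as uniformly low GC content.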
The Echinoderm Genome in the Context of Metazoan Evolution
The sea urchin genetic tool kit lends evolutionary perspective to the gene catalogs that characterize the superclades of the bilaterian animals. The distribution of highly conserved protein domains and sequence motifs provides a view of the expansion and contraction of gene families, as well as an insight into changes in protein function. Examples are enumerated in Table 1, which presents a global overview of gene variety obtained by comparing sequences identified in Interpro, and Table 2, which shows the distribution of specific Pfam database domains associated with selected aspects of cell physiology, including sequences identified in the cnidarian Nematostella vectensis (11). The Interpro data suggest that about one-third of the 50 most prevalent domains in the sea urchin gene models are not in the 50 most abundant families in the other representative genomes (mouse, tunicate, fruit fly, and nematode), and thus, they constitute expansions that are specific at least to sea urchins, if not to the complex of echinoderms and hemichordates. Two of the most abundant domains make up 3% of the total and mark genes that are involved in the innate immune response. Others define proteins associated with apoptosis and cell death regulation, as well as proteins that serve as downstream effectors in the Toll–interleukin 1 (IL-1) receptor (TIR) cascade. The quinoprotein amine dehydrogenase domain seen in the sea urchin set is 10 times as abundant as in other representative genomes and may be used in the systems of quinone-containing pigments known to occur in these marine animals. The large number of nucleosomal histone domains found agrees with the long-established sea urchin–specific expansion of histone genes. In summary, the distribution of [...] protein families, rather than frequent gene innovation or loss. Gene family sizes in the sea urchin are more closely correlated with what is seen in [...]
Of equal interest are the sorts of proteins not found in sea urchins. The sea urchin gene set shares with other bilaterian gene models about 4000 domains, whereas 1375 domains from other bilaterian genomes are not found in the sea urchin set. In agreement with the lack of morphological evidence of gap junctions in sea urchins, there are no gap junction proteins (connexins, pannexins, and innexins). Also missing are several protein domains unique to insects, such as insect cuticle protein, chitin-binding protein, and several pheromone- or odorant-binding proteins, as well as a vertebrate invention, the Krüppel-associated box or KRAB domain, a repressor domain in zinc finger transcription factors (12). Finally, searches for specific subfamilies of G protein–coupled
Table 1. Unique aspects of gene family distribution in sea urchin: Selected [...] shows the name given to the domain or motif family in the database. Species abbreviations: Sp, Strongylocentrotus purpuratus; Mm, Mus musculus; Ci, Ciona intestinalis; Dm, Drosophila melanogaster; Ce, Caenorhabditis elegans.
Species, total number (percentage of total matches)
NACHT nucleoside triphosphatase
Quinoprotein amine dehydrogenase, b chain–like
Fig. 1. The phylogenetic position of the sea urchin relative to other model systems and humans. The chordates are shown on the darker blue background overlapping the deuterostomes as a whole on a lighter blue background. Organisms for which genome projects have been initiated or finished are shown across the top.
receptors (GPCRs) that are known as chemosensory and/or odorant receptors in distinct bilaterian phyla failed to detect clear representatives in the sea urchin genome. However, this failure more likely reflects the independent evolution of these receptors, rather than a lack of chemoreceptive molecules, because the sea urchin genome encodes close to 900 GPCRs of the same superfamily (rhodopsin-type GPCRs), several of which are [...]
[...] way to compare gene sets is to count the strict orthologs that give reciprocal BLAST matches. Genes that are genuine orthologs are likely to yield each other as a best hit. Comparison of sea urchin, fruit fly, nematode, ascidian, mouse, and human gene sets (Fig. 2) indicates that the greatest number of reciprocal best matches is observed between mouse and human, which reflects their close relation. The numbers of presumed orthologous genes between the ascidian and the two mammals [...] The difference is consistent with the lower gene number and reduced genome size in the urochordates [...]
The number of reciprocal pairs for sea urchin
and mouse is about 1.5 times the matches between
proteins in sea urchin and fruit fly. The number of
nematode proteins matching either sea urchin or
fruit fly is even lower. This is likely the result of
the more rapid sequence changes in the nematode
compared with the other species used in this
analysis. More than 75% of the genes that are
shared by sea urchin and fruit fly are also shared
between sea urchin and mouse. Thus, these genes
constitute a set of genes common to the bilaterians,
whereas the additional sea urchin–mouse pairs are
unique to the deuterostomes.
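The reciprocal-best-match criterion used in this comparison can be sketched in a few lines. The example assumes that pairwise alignment scores have already been computed and filtered (for instance at the E-value cutoff given in the Fig. 2 legend) and are supplied as dictionaries keyed by (query, subject); the gene identifiers and scores are invented for illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> alignment score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical sea urchin (sp) versus mouse (mm) scores.
sp_vs_mm = {("sp1", "mm1"): 200, ("sp1", "mm2"): 150,
            ("sp2", "mm2"): 90, ("sp3", "mm1"): 80}
mm_vs_sp = {("mm1", "sp1"): 210, ("mm1", "sp3"): 70,
            ("mm2", "sp2"): 160, ("mm2", "sp1"): 140}
pairs = reciprocal_best_hits(sp_vs_mm, mm_vs_sp)
```

In this toy data sp3 finds mm1 as its best hit but mm1 prefers sp1, so sp3 is not counted, which is how the strict 1:1 criterion excludes paralogs and lineage-specific duplicates.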
The sea urchin genome consequently provides evidence for the now extremely robust concept of the deuterostome superclade. A 1908 concept that originated in the form of embryos of dissimilar species (14) is demonstrated by [...]
In the 1980s, the sea urchin embryo became the focus of cis-regulatory analyses of embryonic gene expression, and there was a great expansion of molecular explorations of the developmental cell biology, signaling interactions, and regulatory control systems of the embryo. Analysis of the entire genome facilitated the first large-scale correlation of the gene regulatory network for development, which represents the genomic control circuitry for specification of the endo[...] the encoded potential of the sea urchin.
The embryo transcriptome and regulome. Because of indirect development in the sea urchin, [...] plan formation, in developmental process and in time, and therefore, it is possible to estimate the genetic repertoire specifically required for formation of a simple embryo (10). Pooled mRNA preparations from four stages of development, up to the mid-late gastrula stage (48 hours), were hybridized with a whole-genome tiling array. Expression of about 12,000 to 13,000 genes, as [...] period, indicating that ~52% of the entire protein-coding capacity of the sea urchin genome is expressed during development to the mid-late gastrula stage. An additional set of microarray experiments extended the interrogation of embryonic expression to the 3-day pluteus larva stage (see SOM) (18).
Table 2. Distribution among sequenced animal genomes of various Pfam domains associated with selected aspects of eukaryotic cell physiology. In S. purpuratus, the number of annotated genes is listed; the number in parentheses is the total number of models (including ones that were not annotated) predicted to contain the Pfam domain. For Nematostella vectensis (Nv), numbers were obtained by searching Stellabase (11).
Complexity intermediate between that in vertebrates and protostome invertebrate
Complexity greater than that found in other model organisms
Complexity lower than that found in other model organisms
*Numbers of histone genes refer to distinct core or linker histone genes, as opposed to total gene number as a result of large tandemly repeated arrays (e.g., ~400 clusters of early histone arrays in sea urchin, 100 copies of a tandem array in Drosophila, with each array containing a gene for the four core and one H1 histone). †Numbers for Hs, Dm, and Ce obtained from (53).
Fig. 2. Orthologs among the Bilateria. The number of 1:1 orthologs captured by BLAST alignments at a match value of e = 1 × 10^–6 in comparisons of sequenced genomes among the Bilateria. The number of orthologs is indicated in the boxes along the arrows, and the total number is shown under the species symbol. Hs, Homo [...] Sp, S. purpuratus; Dm, Drosophila melanogaster; Ce, Caenorhabditis elegans.
The DNA binding domains of transcription factor families are conserved across the Bilateria, and these protein domain motifs were used to extract the sea urchin homologs (see SOM). For each identified gene, if data were not already available, probes were built from the genome sequence and used to measure transcript concentration by quantitative polymerase chain reaction [...] determine spatial expression by whole-mount in [...]
All bilaterian transcription factor families
were represented in the sea urchin with a few
rare exceptions (see below), so the sea urchin
data strongly substantiate the concept of a pan-
bilaterian regulatory tool kit (19) or “regulome.”
We found that 80% of the whole sea urchin
regulome (except the zinc finger genes) was
expressed by 48 hours of embryogenesis (20), an
even greater genetic investment than the 52%
total gene use in the same embryo.
[...] genes involved in signal transduction were identified. Comparative analysis highlights include the protein kinases that mediate the majority of signaling and coordination of complex pathways in eukaryotes. The S. purpuratus genome has 353 protein kinases, intermediate between the core [...] conserved sets of ~230. Fine-scale classification and comparison with annotated kinomes (21, 22) reveal a remarkable parsimony. Indeed, with only [...] urchin has members of 97% of the human kinase subfamilies, lacking just four of those subfamilies (Axl, FastK, H11, and NKF3), whereas Drosophila lacks 20 and nematodes 32 (Fig. 3) (23). Most sea urchin kinase subfamilies have just a single member, although many are expanded in vertebrates; thus, the sea urchin kinome is largely nonredundant. The sea urchin therefore possesses [...] vertebrates without the complexity. A small number of kinases were more similar to insect than to vertebrate homologs (including the Titin homolog Projection, the Syk-like tyrosine kinase Shark, and several guanylate cyclases), which [...] in vertebrates (23). Expression profiling showed that 87% of the signaling kinases and 80% of the 91 phosphatases were expressed in the embryo (23, 24), which emphasized the importance of signaling pathways in embryonic development.
The small guanosine triphosphatases (GTPases) function as molecular switches in signal transduction, nuclear import and export, [...] GTPase families were expanded after their divergence from echinoderms, in part by whole-genome duplications (25–27). The sea urchin genome did not undergo a whole-genome [...] families (Ras, Rho, Rab, and Arf) revealed that local gene duplications occurred (Fig. 4), which ultimately resulted in a comparable number of [...] genomes (28). Thus, expansion of each family in vertebrates and echinoderms was achieved by distinct mechanisms (gene-specific versus whole-genome duplication). More than 90% of the small GTPases are expressed during sea urchin embryogenesis, which suggests that the complexity of signaling through GTPases is comparable between sea urchins and vertebrates.
[...] plays a central role in specification and patterning during embryonic development. Phylogenetic analyses from cnidarian to human indicate that of the 13 known Wnt subfamilies, S. purpuratus has [...] absent from deuterostomes (29). Of 126 genes described as components of the Wnt signal transduction machinery, homologs of ~90% were [...] high level of conservation of all three Wnt pathways (30). However, of 94 Wnt transcriptional target genes reported in the literature, mostly from vertebrates (31), only 53% were found with high confidence in the sea urchin genome (Fig. 6). The absent Wnt targets include vertebrate adhesion molecules, which were frequently missing from the sea urchin genome (32), as well as signaling receptors, which are more divergent and thus more difficult to identify. In contrast, most transcription factor targets of the Wnt pathway are present in the genome, which reflects a higher degree of conservation of transcription factor families (20). Taken together, the genomic analysis of signal transduction components indicates that sea urchins have signaling machinery strikingly comparable to that of vertebrates, often without the complexity that arises from genetic redundancy.
Sea Urchin Biology
Analysis of the genome allows understanding of parts of the organism that have not been well studied. Several examples of this follow with further details in the SOM. Additional areas such as intermediary [...]ture, fertilization, and germline specification are presented in the SOM.
The need to deal with physical, chemical, and biological challenges in the environment underlies the evolution of an array of defense gene families and pathways. One set of protective mechanisms involves the [...] such as pathogens. A second group of genes comprises a chemical “defensome,” a network of stress-sensing transcription factors and defense proteins that transform and eliminate many potentially toxic chemicals.
[...] has a greatly expanded innate immunity repertoire compared with any other animal studied to date [...] are particularly increased (Fig. 7). These make up [...]larly large family of genes that encode NACHT [...] (NLRs), and a set of genes encoding multiple scavenger receptor cysteine-rich (SRCR) domain proteins of a class highly expressed in the sea urchin immune cells or coelomocytes (33, 34). Receptors from each of these families participate in immunity by recognizing nonself molecules that are conserved in pathogens or by responding to self molecules that indicate the presence of infection (35). In contrast, homologs of signal transduction proteins and nuclear factor kappa B (NFkB)/Rel domain transcription factors that are known to function further downstream of these genes were present in numbers similar to those in other invertebrate species. One of the more unexpected findings from our analysis of sea urchin immune genes was the identification of a Rag1/2-like gene cluster (36). The presence of this cluster, along with other recent findings (37), suggested the possibility that these genes had been part of animal genomes for longer than previously considered. Further analysis of the genomic insights into the innate immune system and the underpinnings of vertebrate adaptive immunity can be found in a review in this issue (38).
The complement system. The complement system of vertebrates is a complex array of soluble serum proteins and cellular receptors arranged into three activation pathways (classical, lectin, and alternative) that converge and activate the terminal or lytic pathway. This system opsonizes pathogenic cells for phagocytosis and sometimes activates the terminal pathway, which leads to pathogen destruction. An invertebrate complement system was first identified in the sea urchin [for reviews, see (39, 40)], and
Fig. 3. Protein kinase evolution: Invention and loss of [...]stomes share 9 protein kinase subfamilies absent from C. elegans and Drosophila, and the sea urchin has not lost [...] insects or nematodes. [From (23)]
the analysis of the genome sequence presented a more complete picture of this important immune effector system. In chordates, collectins initiate the lectin cascade through members of the mannose-binding protein (MBP)–associated protease (MASP)/C1r/C1s family. Several genes encoding collectins, C1q and MBP, have been predicted (39) and were present in the genome; however, members of the MASP/C1r/C1s family were not identified. There was no evidence for the classical pathway, which links the complement cascade with immunoglobulin recognition in jawed vertebrates. The alternative pathway is initiated by members of the thioester protein family, which, in the sea urchin, was somewhat expanded with four genes. Two of the thioester proteins, SpC3 and SpC3-2, are known to be expressed, respectively, in coelomocytes and in embryos and larvae. Furthermore, there were three homologs of factor B, the second member of the alternative pathway (41).
The terminal complement pathway in verte-
brates acts to destroy pathogens or pathogen-
infected cells with large pores called membrane
attack complexes (MACs). Twenty-eight gene
models were identified that encode MAC-
perforin domains, but none of these had the
additional domains expected for terminal com-
plement factors (C6 through C9). Instead, these
are members of a novel and very interesting
gene family with perforin-like structure. In ver-
tebrates, perforins carry out cell-killing functions
by cytotoxic lymphocytes through the formation
of small pores in the cell membranes. If the com-
plement system in the sea urchin functions
through multiple lectin and alternative pathways
in the absence of the lytic functions of the
terminal pathway, the major activity of this
system is expected to be opsonization.
Homologs of immune regulatory proteins.
Cytokines are key regulators of intercellular
communication involving immune cells, acting to
coordinate vertebrate immune systems. Genes en-
coding cytokines and their receptors often evolve
at a rapid pace, and most families are known only
from vertebrate systems. Although members of
many cytokine, chemokine, and receptor families
were not identified in the sea urchin genome, a
number of important immune signaling homologs
were present. These included members of the
tumor necrosis factor (TNF) ligand and receptor
superfamilies, an IL-1 receptor and accessory
proteins, two IL-17 receptor–like genes and 30
IL-17 family ligands, and nine macrophage
inhibitory factor (MIF)–like genes. Receptor
tyrosine kinases (RTKs) included those that bind
important growth factors that regulate cell prolifer-
ation in vertebrate hematopoietic systems. Of
particular note, from the sea urchin genome, were
two vascular endothelial growth factor (VEGF)
receptor–like genes and a Tie1/2 receptor, all of
which were expressed in adult coelomocytes.
Many of these genes are homologs of important
inflammatory regulators and growth factors in
higher vertebrates, and these sea urchin homologs
may have similar functions in regulating coelomo-
cyte differentiation and recruitment.
Representatives of nearly all subclasses of
important vertebrate hematopoietic and immune
transcription factors were present in the sea urchin
of immune transcription factors that had not been
identified previously outside of chordates, includ-
ing PU.1/SpiB/SpiC, a member of the Ets sub-
Ikaros subfamily. Transcript prevalence measure-
ments showed that PU.1, the Ikaros-like gene and
Fig. 4. Partial phylogenies of the Rho (A) and the Rab families (B) of small GTPases.
numbers, resulting in a complexity comparable to vertebrates. Numbers at each
junction represent confidence values obtained via three independent phylogenetic
methods [neighbor-joining (green), maximum parsimony (blue), and Bayesian
(black)]; red stars indicate nodes retained by maximum likelihood. [From (28)]
Cell Leukemia (SCL) were all expressed at
substantial levels in coelomocytes (41). This was
consistent with the presence of conserved mecha-
nisms of regulating gene expression among sea
urchin coelomocytes and vertebrate blood cells.
ABC transporters. Many chemicals are
removed from cells by efflux proteins known
as ATP-binding cassette (ABC) or multidrug
efflux transporters. S. purpuratus has 65 ABC
transporter genes in the eight major subfamilies
of these genes [ABC A to H; (42)]. The ABCC
family of multidrug transporters is about 25%
larger than in other deuterostome genomes with
at least 30 genes in this family (nearly half of
the sea urchin ABC transporters), and 25 of
these 30 genes showed substantial mRNA
expression in eggs, embryos, or larvae. Much
of the expansion is in the Sp-ABCC5 and Sp-
ABCC9 families, whereas orthologs of the
vertebrate gene ABCC2 (also called MRP2)
are absent. Because the ABCC family is known
to generally transport more hydrophilic com-
pounds than other transporter families, such as
the ABCB genes, sea urchins may have in-
creased need for transport of these compounds.
ABCC efflux activity has been described in sea
urchin embryos and, consistent with the ge-
nomic expansion of the ABCC family, the
major activity in early embryos ensues from an
ABCC-like efflux mechanism.
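The proportions quoted in the ABC transporter paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text (65 ABC transporter genes, at least 30 ABCC genes, 25 of them with substantial mRNA expression); the variable names are illustrative and are not taken from the source.

```python
# Gene counts as stated in the text; 30 is a lower bound ("at least 30").
ABC_TOTAL = 65        # all ABC transporter genes in S. purpuratus
ABCC_GENES = 30       # ABCC family members
ABCC_EXPRESSED = 25   # ABCC genes with substantial mRNA expression

# Share of all ABC transporters that belong to the ABCC family.
abcc_share = ABCC_GENES / ABC_TOTAL

# Share of ABCC genes expressed in eggs, embryos, or larvae.
expressed_share = ABCC_EXPRESSED / ABCC_GENES

print(f"ABCC share of ABC transporters: {abcc_share:.1%}")
print(f"Expressed share of ABCC genes: {expressed_share:.1%}")
```

The first ratio is about 46%, which matches the description "nearly half of the sea urchin ABC transporters".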
Cytochrome P-450 monooxygenase (CYP).
Enzymes in the CYP1, CYP2, CYP3, and CYP4
families carry out oxidative biotransformation of
chemicals to more hydrophilic products. The sea
gene families 1 to 4 constitute 80% of the total,
to expand functionality in these gene families
(42). Eleven CYP1-like genes are present in the
sea urchin genome, more than twice the number
also present at greater numbers than in other
deuterostomes. In addition to the CYPs in
families 1 to 4, the sea urchin genome contains
homologs of proteins involved in developmental
patterning (CYP26), cholesterol synthesis
(CYP51), and metabolism (CYP27, CYP46).
Homologs of some CYPs with endogenous
functions in vertebrates were not found; however,
(CYP19, androgen aromatase; CYP8, prostacy-
clin synthase; CYP11, pregnenolone synthase;
CYP7, cholesterol-7α-hydroxylase). These CYP
genes in concert with additional expanded de-
fensive gene families represent a large diversifi-
cation of defense gene families by the sea urchin
relative to mammals (42).
Oxidative defense and metal-complexing
proteins. The metal-complexing proteins include
phytochelatin synthase genes. Genes for antioxi-
dant proteins include three superoxide dismutase
(SOD) genes and a gene encoding ovoperoxidase
(an unusual peroxidase with SOD-like activity),
along with one catalase, four glutathione peroxi-
dase, and at least three thioredoxin peroxidase
genes. Reactive oxygen detoxification genes may
urchins, because oxidative damage is thought to
be a major factor in aging.
Diversity and conservation in xenobiotic
signaling. The diversity of genes encoding
xenobiotic-sensing transcription factors that
regulate biotransformation enzymes and trans-
porters was similar to other invertebrate ge-
nomes, but in most cases lower than vertebrates.
Fig. 5. Survey of the Wnt family of secreted signaling molecules in selected
metazoans. Each square indicates a single Wnt gene identified either through
genome analyses or independent studies, and squares with a question mark
indicate uncertainty of the orthology. Letter X’s represent absence of members of
that subfamily in the corresponding annotated genome; empty spaces have been
left for species for which genomic databases are not yet available. [From (30)]
Fig. 6. Presence of Wnt signaling machinery components (A) and target genes (B) in the S. purpuratus
genome. (A) The 126 genes involved in the transduction of the Wnt signals have been separated into
four categories from the extracellular compartment to the nucleus. Sea urchin homologs are identified
by the lighter shade (indicated by both the number and the percentage of homologs that were
identified within the chart); the total number of known genes is indicated in the chart legend. (B) The
93 reported Wnt targets have been divided into three categories: signaling molecules, transcription
factors, and cell adhesion molecules. Colors and numbers are as in (A).
For example, the sea urchin genome encoded a
single predicted CNC-bZIP protein homologous
to the four human CNC-bZIP proteins involved in
the response to oxidative stress. There were two
sea urchin homologs of the aryl hydrocarbon
receptor (AHR), which in vertebrates mediates
the transcriptional response to polynuclear and
halogenated aromatic hydrocarbons and, in both
protostomes and deuterostomes, also regulates
specific developmental processes (43–45). One of
the sea urchin AHR homologs was more closely
related to the vertebrate AHR; the other shared
greatest sequence identity with the Drosophila
genes encoding hypoxia-inducible factors (HIFα
subunits), which regulate adaptive responses to
hypoxia, and a gene encoding ARNT, a PAS
protein that is a dimerization partner for both
AHRs and HIFs.
Strongylocentrotus purpuratus has 32 nu-
clear receptor (NR) genes (20), two-thirds the
roles in chemical defense (42). The sea urchin ge-
nome also contains two peroxisome proliferator–
activated receptor (PPAR, NR1C) homologs and
an NR1H gene coorthologous to both liver X
receptor (LXR) and farnesoid X receptor (FXR)
(42). Genes homologous to the vertebrate xeno-
biotic sensor NR1I genes [pregnane X receptor,
PXR; constitutive androstane receptor, CAR (46)]
are absent, although three NR1H-related genes
were found, which possibly form a new subfamily
of genes involved in xenobiotic sensing.
Many of the defense genes are expressed
during development (10, 42), which suggests
that they have dual roles in chemical defense
and in developmental signaling. In several
cases (CYPs, AHR, NF-E2), the evolution of
pathways for chemical defense may have
involved recruitment from developmental
signaling pathways (42).
The echinoderm nervous system is the least well
studied among the major metazoan phyla. For a
number of technical reasons, the structure and
function of echinoderm nerves have been
neglected. Analysis of the sea urchin genome
has enabled an unprecedented glimpse into the
neural and sensory functions and has revealed
several novel molecular approaches to the study
of echinoderm nervous systems (Table 3).
The nervous systems of echinoderm larvae
and adults are dispersed, but they are not simple
nerve nets. This organization differs from both
system, and hemichordates, which do have nerve
nets (47). Adult sea urchins have thousands of
appendages, each with sensory neurons, ganglia,
and motor neurons arranged in local reflex arcs.
These peripheral appendages are connected to
each other and to radial nerves, which provide
overall control and coordina-
tion (47, 48).
Nearly all of the genes
encoding known neurogenic
transcription factors are pre-
and several are expressed in
neurogenic domains before
gastrulation, which indicates
that they may operate near
gene regulatory network (47).
Axon guidance molecules
known from other metazoans
are also expressed in the
developing embryo. Unex-
pectedly, genes encoding the
system that were thought to
are present in sea urchin,
which suggests a deutero-
stome origin and a potential
loss in urochordates.
The genes required to
construct neurons and to
transmit signals are present,
but the repertoire of neural
genes and the initial charac-
terization of expression of a
number of them led to unex-
pected and surprising conclusions. There appear
to be no genes encoding gap junction proteins,
which suggests that communication among neu-
coupling. The repertoire of sea urchin neuro-
transmitters is large, but melatonin and adrenalin
are lacking, as they are in ascidians (4, 47).
Cannabinoid, lysophospholipid, and melanocor-
tin receptors are not present in urchins, but
orthologs were found in ascidians (4, 47). In
contrast, some sets of genes thought to be
chordate-specific have sea urchin orthologs, for
example, insulin and insulin-like growth factors
(IGFs) that are more similar to their chordate
counterparts than those of other invertebrates
(47). Overall, the genome contains representa-
tives of all five large superfamilies of GPCRs,
including those that mediate signals from neuro-
and rhodopsin superfamilies display marked
lineage-specific expansions (13, 47).
Sensory systems. There were 200 to 700
putative chemosensory genes that formed large
clusters and lacked introns, which are features of
chemosensory genes in vertebrates, but not in
Caenorhabditis elegans and Drosophila mela-
nogaster. Many of these genes encoded amino
acid motifs that were characteristic of vertebrate
chemosensory and odorant receptors (13, 47).
Sea urchins had an elaborate collection of
photoreceptor genes that quite surprisingly
appeared to be expressed in tube feet (13, 47).
These included many genes encoding tran-
scription factors regulating retinal development
and a photorhodopsin gene.
affecting hearing, balance, and retinitis pigmen-
tosa (retinal photoreceptor degeneration). Most
of the genes involved have been identified, and
they encode a set of membrane and cytoskeletal
proteins that form an interacting network that
controls the arrangement of mechanosensory
stereocilia in hair cells of the mammalian ear.
Many or all of the proteins play some roles in
photoreceptor organization and/or maintenance.
Orthologs of virtually the entire set of membrane
and cytoskeletal proteins of the Usher syndrome
network were found in the sea urchin genome.
These include the very large membrane proteins,
usherin and VLGR-1 and large cadherins
(Cadh23 and possibly Pcad15), all of which
participate in forming links between stereocilia
in mammalian hair cells, as well as myosin 7 and
15, two PDZ proteins (harmonin and whirlin) and
another adaptor protein (SANS), which partici-
pate in linking these membrane proteins to the
cytoskeleton. In addition, two membrane trans-
porters, NBC (a candidate Usher syndrome target
known to interact with harmonin) and TrpA1 (the
mechanosensory channel connected to the tip
links containing cadherin 23), have orthologs in
the sea urchin genome. Sea urchins do not have
Fig. 7. Gene families encoding important innate immune receptors
and complement factors in animals with sequenced genomes. For
some key receptor classes, gene numbers in the sea urchin exceed those in
other animals by more than an order of magnitude. Representative
animals include H.s., Homo sapiens; C.i., Ciona intestinalis; S.p.
Strongylocentrotus purpuratus; D.m. Drosophila melanogaster; and
C.e. Caenorhabditis elegans. Indicated gene families include TLR, toll-
like receptors; NLR, NACHT and leucine-rich repeat (LRR) domain–
containing proteins similar to the vertebrate Nod/NALP genes; SRCR,
Scavenger receptor cysteine-rich domain genes; PGRP, peptidoglycan
recognition protein domain genes; and GNBP, Gram-negative binding
proteins. C3/4/5, thioester proteins homologous to vertebrate C3, C4,
and C5; Bf/C2, complement factors homologous to vertebrate C2 and
factor B; C1q/MBP, homologs of vertebrate lectin pathway receptors;
and Terminal pathway, homologs of vertebrate C6, C7, C8, and C9.
SRCR gene statistics are given as domain number/gene number for
all SRCR proteins). Asterisk in the D. melanogaster C3/4/5 column is
meant to denote the presence of related thioester genes (TEPs) and a
true C3/4/5 homolog from another arthropod. +/− for C. intestinalis
Terminal pathway column indicates the presence of genes with
similarity to C6 only (Nonaka and Yoshizaki 2004). Phylogenetic
relations among species are indicated by a cladogram at the left.
other sensory processes. Sea urchins respond to
light, touch, and displacement and probably use
some of the same sensory genes used by vertebrates.
The Echinoderm Adhesome
The S. purpuratus genome contained representa-
tives of all the standard metazoan adhesion
receptors (table S7), but the emphasis on different
classes of receptors differed substantially from that
used by vertebrates. The integrin family was
and vertebrates—several chordate-specific
expansions of the integrin repertoire were absent,
and there were some expansions unique (so far)
to echinoderms. The cadherin repertoire was also
of over a hundred), and many chordate-specific
expansions were missing. Specialized large
cadherins shared by protostomes and vertebrates
were present, as well as some specialized large
cadherins previously thought to be chordate-
specific, but overall, the cadherin repertoire was
more invertebrate than vertebrate in character.
Sea urchins lacked the integrins and cadherins
that link to intermediate filaments in vertebrates.
In contrast, sea urchins had large repertoires
of adhesion molecules containing immuno-
globulin superfamily, fibronectin type 3 repeat
(FN3), epidermal growth factor (EGF), and
LRR repeats. In addition to the expansion of
TLRs and NLRs mentioned above, there are
large expansions of other LRR receptor families,
including GPCRs (32). The key neural adhesion
systems involved in regulating axonal outgrowth
were present (netrin/Unc5/DCC; Slit/Robo; and
semaphorins/plexins), as were adhesion mole-
cules involved in synaptogenesis (Agrin/MUSK;
and neurexin/neuroligins). This was not
surprising because these molecules were known
in both protostomes and vertebrates. However,
structurally, the synapses of echinoderms are
unusual because there are no direct synaptic
contacts (49). Some of them were expressed in
sea urchin embryos before there are any neu-
rons, suggesting that they may have other roles
The basic metazoan basement membrane ex-
tracellular matrix (ECM) tool kit was present—
two alpha-IV collagen genes, perlecan, laminin
subunits, nidogen, and collagen XV/XVIII. There
did not appear to be much, if any, expansion of
these gene families, as is found in vertebrates,
which suggests that there is less diversity among
basement membranes. Quite a few ECM proteins
present in chordates, but not protostomes, were
vertebrate-type matrix proteoglycans, and com-
plex VWA/FN3 collagens among others (32).
Absence of these genes may be related to the
absences of neural crest migration, a high shear
endothelial-lined vasculature and, of course, car-
tilage and bone.
In addition to the components of Usher syn-
dromes mentioned above, it was surprising to
find a clear ortholog of reelin, a large ECM
protein involved in establishing the layered
organization of neurons in the vertebrate cerebral
associated with Norman-Roberts-type lissen-
cephaly syndrome. Reelin has a unique domain
Table 3. Genomic insights into sea urchin neurobiology.
Neural process | Revelations from the genome | Genes
Neural development | Neurogenic ectoderm is specified in early [...] | Sp-Achaete-scute, Sp-homeobrain, Sp-Rx (retinal anterior homeobox), Sp-Zic2
Synapse structure and function | Echinoderm synapses are structurally unusual, despite the presence of many genes encoding proteins involved in [...] | Sp-Neurolignin, Sp-neurexin, Sp-agrin, Sp-MUSK, Sp-thrombospondin, Sp-Rim2, Sp-Rab3, exocyst complex, Snares, SM, synaptotagmins
Electrical signaling and coupling | Neurons have ion channel proteins, but lack electrical coupling via gap junctions. | Voltage-gated K+, Ca2+, and Na+ channels, but no connexins or pannexins/innexins
[...] | Neurons use the same neurotransmitters as vertebrates, but lack melatonin and adrenalin. | Enzymes involved in synthesis, transport, reception, and hydrolysis of serotonin, dopamine, noradrenaline, γ-aminobutyric acid (GABA), histamine, acetylcholine, glycine, and nitric oxide
GPCR signaling | Identification of GPCRs that are unique to chordates and identification of expanded [...] | Orthologs of vertebrate cannabinoid, lysophospholipid, and melanocortin receptors are absent; 162 secretin receptor-[...]
Peptide signaling | G protein–coupled peptide receptors indicate diversity in peptide signaling systems, but only a few sea urchin neuropeptides or peptide hormones identified. | 37 G protein–coupled peptide receptors. Precursors for SALMFamides, NGFFFamide, and a vasotocin-like peptide
Neurotrophins | Neurotrophins and neurotrophin receptors are not unique to chordates. | Sp-Neurotrophin, Sp-Trk, Sp-p75NTR, [...]
Insulin and IGFs | More similar to vertebrate forms than invertebrate insulin-like molecules. | Sp-IGF1, SpIGF2
Chemosensory functions | A large family of predicted chemoreceptor genes, some expressed in tube feet or pedicellariae, indicates a complex [...] | Over 600 genes encoding putative G protein–coupled chemoreceptors, many tandemly repeated and lacking introns
Photoreception functions | Genes associated with photoreception are expressed in tube feet. | Photorhodopsins, Sp-Pax6, retinal [...]
[...] | Orthologs of vertebrate mechanosensory genes are present. | Sp-Usherin, Sp-VLGR-1, Sp-cadherins, Sp-myosin 7, Sp-myosin 15, Sp-harmonin, Sp-whirlin, Sp-NBC, Sp-TrpA1
composition and organization (Reeler, EGF,
BNR) that has not been found outside chordates,
but the sea urchin genome included a very good
homolog of reelin. Receptors for reelin are
believed to include low-density lipoprotein
receptor–related proteins (LRPs), and there are a
number of these receptors in S. purpuratus
although it is as yet unclear whether they are
reelin receptors, lipoprotein receptors, or some-
thing else. Similar receptors are also involved in
human disease (atherosclerosis).
Among the deuterostomes, only echinoderms
and vertebrates produce extensive skeletons.
The possible evolutionary relations between
biomineralization processes in these two groups
have been controversial. Analysis of the S.
purpuratus genome revealed major differences
in the proteins that mediate biomineralization in
echinoderms and vertebrates (50). First, there
were few sea urchin counterparts of extracellular
proteins that mediate biomineral deposition in
tant class of proteins involved in biomineraliza-
tion is the family of secreted, calcium-binding
phosphoproteins, or SCPPs. Sea urchins did not
have counterparts of SCPP genes, which sup-
ports the hypothesis that this family arose via a
series of gene duplications after the echinoderm-
chordate divergence (51). Second, almost all of
the proteins that have been directly implicated in
the control of biomineralization in sea urchins
ton consists of magnesium calcite (as distinct
from the calcium phosphate skeletons of verte-
proteins. The sea urchin spicule matrix proteins
were encoded by a family of 16 genes that are
organized in small clusters and likely proliferated
by gene duplication. Counterparts of sea urchin
spicule matrix genes were not found in verte-
brates, amphioxus, or ascidians. Likewise, other
genes that have been implicated in bio-
mineralization in sea urchins, including genes
that encode the transmembrane protein P16 and
MSP130, a glycosylphosphatidylinositol-linked
glycoprotein, were members of small clusters of
closely related genes without apparent homologs
in other deuterostomes. The members of all three
of these sea urchin–specific gene families were
expressed specifically by the biomineral-forming
cells of the embryo, the primary mesenchyme
cells [see (50)]. As a whole, these findings
highlighted substantial differences in the primary
sequences of the proteins that mediate biominer-
alization in echinoderms and vertebrates.
Cytoskeletal genes. In addition to identifying
genes for all previously known S. purpuratus
actins and tubulins, one δ- and two ε-tubulin
genes were found (52). Newly identified motor
es of myosin, and eight more families of kinesins.
The first dynein cloned and sequenced was from
sea urchin, and although most S. purpuratus
dynein heavy chain genes mapped one-to-one to
mammalian homologs, Sp-DNAH9 mapped one-
to-three, as it was equidistant between the closely
similar mammalian genes DNAH9, DNAH11, and
Our estimate of 23,300 genes is similar to es-
timates for vertebrates, despite the fact that two
whole-genome duplications are believed to have
occurred in the chordate lineage after divergence
from the lineage leading to the echinoderms
(25–27). From the analysis presented here, it
seems likely that many mechanisms shaped the
final genetic content of these genomes. On the
one hand, there are cases of gene families that
are expanded in vertebrates compared with sea
urchin, including examples of the expected 4:1
ratio from two duplications (15). However, other
patterns are also found. The nuclear receptor
family is only slightly reduced in sea urchin
compared with that of humans, which suggests
gene loss followed the vertebrate duplications.
The unprecedented expansions of innate im-
mune system diversity contrast sharply with the
much smaller sets of counterparts that are
present in the sequenced genomes of proto-
stomes, Ciona, and vertebrates, an example of
independent expansion in the sea urchin,
whereas the GTPases described here have
expanded in sea urchin to about the same
numbers as in vertebrates. Thus, whereas the
duplications of the chordate lineage were a
contributor to the increased complexity of
vertebrates, regional expansions clearly play a
large role in the evolution of these animals.
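The "expected 4:1 ratio" follows from simple doubling: each whole-genome duplication doubles the copy number of every gene, so two rounds give four copies per ancestral gene if nothing is lost. A minimal sketch of that arithmetic follows. The function name and the `retained_fraction` parameter are hypothetical and introduced only for illustration; the uniform per-round retention model is an assumption, not something the source describes.

```python
def expected_copies(duplications: int, retained_fraction: float = 1.0) -> float:
    """Expected copies per ancestral gene after whole-genome duplications.

    retained_fraction is a hypothetical, uniform probability that a copy
    survives each round; 1.0 means no gene loss at all.
    """
    copies = 1.0
    for _ in range(duplications):
        # Every copy is duplicated, then only a fraction is retained.
        copies = copies * 2 * retained_fraction
    return copies


# Two duplications with no loss give the 4:1 ratio mentioned in the text.
print(expected_copies(2))
# With loss after each round, the ratio falls below 4:1.
print(expected_copies(2, retained_fraction=0.75))
```

Under this toy model, any ratio below 4:1 (such as the only slightly larger human nuclear receptor family) is consistent with gene loss after duplication, while ratios that favor the sea urchin point to independent expansion in that lineage.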
The refinement of the inventory of vertebrate-
specific or protostome-specific genes likewise
benefits from the sea urchin genome. Many more
human genes have shared ancestry across the
deuterostomes, and in fact, bilaterian genes are
more broadly shared than had been inferred from
comparison of the previously limited genome
sequences. The new biological niche sampled by
the sea urchin genome not only provides a clearer
view of the deuterostome and bilaterian ancestor
but has also provided a number of surprises. The
finding of sea urchin homologs for sensory
proteins related to vision and hearing in humans
may lead to interesting new concepts of percep-
tion, and the extraordinary organization of the sea
urchin immune system is different from that of any
animal yet studied. From a practical standpoint,
the sea urchin may be a treasure trove. Because of
the many pathways shared by sea urchin and
human, the sea urchin genome includes a large
number of human disease gene orthologs. Many
of the genes described in the preceding sections
fall into this category (see tables S8 and S9) and
cover a surprising diversity of systems such as
nervous, endocrine, and blood systems, as well as
muscle and skeleton, as exemplified by the
Huntington and muscular dystrophy genes.
Continued exploration of the sea urchin immune
system is expected to uncover additional varia-
tions for protection against pathogens. The im-
mense diversity of pathogen-binding motifs
encoded in the sea urchin genome provides an
invaluable resource for antimicrobial applications
and the identification of new deuterostome
immune functions with direct relevance to hu-
man health. These exciting possibilities show
that much biodiversity is yet to be uncovered by
sampling additional evolutionary branches of
the tree of life.
References and Notes
1. R. A. Gibbs et al., Nature 428, 493 (2004).
2. W. W. Cai, R. Chen, R. A. Gibbs, A. Bradley, Genome Res.
11, 1619 (2001).
3. R. J. Britten, A. Cetta, E. H. Davidson, Cell 15, 1175 (1978).
4. P. Dehal et al., Science 298, 2157 (2002).
5. J. P. Vinson et al., Genome Res. 15, 1127 (2005).
6. R. T. Hinegardner, Anal. Biochem. 39, 197 (1971).
7. C. G. Elsik, A. J. Mackey, J. T. Reese, N. V. Milshina,
D. S. Roos, G. M. Weinstock, Genome Biol., in press.
8. Sea Urchin Genome Project (http://sugp.caltech.edu/
9. Genboree (www.genboree.org).
10. M. Samanta et al., Science 314, 960 (2006).
11. J. C. Sullivan et al., Nucleic Acids Res. 34, D495 (2006).
12. S. C. Materna, M. Howard-Ashby, R. F. Gray, E. H.
Davidson, Dev. Biol. 10.1016/j.ydbio.2006.08.032,
13. F. Raible et al., Dev. Biol., in press.
14. K. Grobben, Verh. Zool. Bot. Ges. Wien 58, 491 (1908).
15. E. H. Davidson, in Gene Regulatory Networks in
Development and Evolution (Academic Press/Elsevier,
San Diego, CA, 2006).
16. E. H. Davidson et al., Science 295, 1669 (2002).
17. E. H. Davidson et al., Dev. Biol. 246, 162 (2002).
18. Z. Wei, R. C. Angerer, L. M. Angerer, Dev. Biol. 10.1016/j.
ydbio.2006.08.034, in press.
19. D. H. Erwin, E. H. Davidson, Development 129, 3021 (2002).
20. M. Howard-Ashby, C. T. Brown, S. C. Materna, L. Chen,
E. H. Davidson, Dev. Biol., in press.
21. G. Manning, G. D. Plowman, T. Hunter, S. Sudarsanam,
Trends Biochem. Sci. 27, 514 (2002).
22. G. Manning, D. B. Whyte, R. Martinez, T. Hunter,
S. Sudarsanam, Science 298, 1912 (2002).
23. C. Bradham et al., Dev. Biol. 10.1016/j.ydbio.2006.08.
074, in press.
24. C. Byrum et al., Dev. Biol., in press.
25. P. Dehal, J. L. Boore, PLoS Biol. 3, e314 (2005).
26. X. Gu, Y. Wang, J. Gu, Nat. Genet. 31, 205 (2002).
27. A. McLysaght, K. Hokamp, K. H. Wolfe, Nat. Genet. 31,
28. W. Beane, E. Voronina, G. M. Wessel, D. R. McClay, Dev.
Biol. 10.1016/j.ydbio.2006.08.046, in press.
29. A. Kusserow et al., Nature 433, 156 (2005).
30. J. Croce et al., Dev. Biol., in press.
31. The Wnt homepage (www.stanford.edu/~rnusse/
32. C. A. Whittaker et al., Dev. Biol. 10.1016/j.ydbio.2006.
07.044, in press.
33. Z. Pancer, Proc. Natl. Acad. Sci. U.S.A. 97, 13156 (2000).
34. Z. Pancer, J. P. Rast, E. H. Davidson, Immunogenetics 49,
35. S. Akira, S. Uematsu, O. Takeuchi, Cell 124, 783 (2006).
36. S. D. Fugmann, C. Messier, L. A. Novack, R. A. Cameron,
J. P. Rast, Proc. Natl. Acad. Sci. U.S.A. 103, 3728 (2006).
37. V. V. Kapitonov, J. Jurka, PLoS Biol. 3, e181 (2005).
38. J. P. Rast et al., Science 314, 952 (2006).
39. L. C. Smith, K. Azumi, M. Nonaka, Immunopharmacology
42, 107 (1999).
40. L. C. Smith, L. A. Clow, D. P. Terwilliger, Immunol. Rev.
180, 16 (2001).
41. T. Hibino et al., Dev. Biol. 10.1016/j.ydbio.2006.08.065,
42. J. Goldstone et al., Dev. Biol. 10.1016/j.ydbio.2006.08.
066, in press.
43. H. Qin, J. A. Powell-Coffman, Dev. Biol. 270, 64 (2004).
44. J. A. Walisser, E. Glover, K. Pande, A. L. Liss, C. A. Bradfield,
Proc. Natl. Acad. Sci. U.S.A. 102, 17858 (2005).
45. D. M. Duncan, E. A. Burgess, I. Duncan, Genes Dev. 12,
46. W. Xie, R. M. Evans, J. Biol. Chem. 276, 37739 (2001).
47. R. D. Burke et al., Dev. Biol. 10.1016/j.ydbio.2006.08.
007, in press.
48. J. L. S. Cobb, in Nervous Systems of Invertebrates,
M. A. Ali, Ed. (Plenum, New York, 1987), pp. 483–525.
49. J. L. Cobb, V. W. Pantreath, Tissue Cell 9, 125 (1977).
50. B. T. Livingston et al., Dev. Biol. 10.1016/j.ydbio.2006.
07.047, in press.
51. K. Kawasaki, T. Suzuki, K. M. Weiss, Proc. Natl. Acad. Sci.
U.S.A. 101, 11356 (2004).
52. R. L. Morris et al., Dev. Biol. 10.1016/j.ydbio.2006.08.
052, in press.
53. E. V. Koonin, L. Aravind, Cell Death Differ. 9, 394 (2002).
54. We gratefully acknowledge the following support:
BCM-HGSC, National Human Genome Research Institute
(NIH) grant 5 U54 HG003273; Naples Workshop,
Stazione Zoologica Naples and the Network of Excellence
“Marine Genomics Europe” (GOCE-04-505403);
M. Elphick, Biotechnology and Biological Sciences
Research Council (BBSRC), UK, grant S19916; J. Rast
laboratory, Natural Sciences and Engineering Research
Council (NSERC) of Canada, Canadian Institutes of Health
Research (CIHR), and the Uehara Memorial Foundation;
J. A. Coffman, Mount Desert Island Biological
Laboratory (MDIBL), NIH grant GM070840;
M. C. Thorndyke, K. H. Wilson, F. Hallböök, R. P. Olinski,
Swedish Science Research Council, Network of Excellence
Marine Genomics Europe (GOCE-04-505403), European
Union Research Training Networks FP5 Trophic
Neurogenome HPRN-ct-2002-00263, and the Royal
Swedish Academy of Sciences, STINT; E. H. Davidson,
R. A. Cameron, the Center for Computational Regulatory
Genomics (E. H. Davidson, principal investigator) was
supported by the NIH grant RR-15044, NSF IOB-
0212869, and the Beckman Institute; also, support for
the E. H. Davidson laboratory is from NIH grants
HD-37105 and GM61005 and U.S. Department of Energy
(DOE) grant DE-FG02-03ER63584; P. Oliveri, Camilla
Chandler Frost Fellowship; G. M. Wessel laboratory
supported by NSF IOB-0620607 and NIH grant R01
HD028152; B. Brandhorst, K. Bergeron, and N. Chen,
NSERC; K. R. Foltz, NSF, IBN-0415581; M. Hahn, NIH
grant R01ES006272; D. Burgess, NIH grant GM058231;
L. C. Smith, NSF (MCB-0424235); R. O. Hynes, Howard
Hughes Medical Institute and National Cancer Institute
(NCI) (MIT Cancer Center core grant P30-CA14051);
D. McClay, NIH grants GM61464, HD039948, and
HD14483; V. D. Vacquier (group leader), G. W. Moy,
H. J. Gunaratne, M. Kinukawa, M. Nomura, A. T. Neill,
and Y.-H. Su, NIH grant R37-HD12896; R. D. Burke,
NSERC and CIHR; L. M. Angerer, National Institute of
Dental and Craniofacial Research (NIDCR), R. C. Angerer
(NIDCR), Z. Wei (NIDCR), G. Humphrey, National
Institute of Child Health and Human Development
(NICHD), M. Landrum, National Center for Biotechnology
Information (NCBI), O. Ermolaeva (NCBI), P. Kitts (NCBI),
K. Pruitt (NCBI), V. Sapojnikov (NCBI), A. Souvorov
(NCBI), W. Hiavina (NCBI), S. Fugmann, National
Institute on Aging (NIA), M. Dean, National Cancer
Institute–Frederick (NCIFCRF) Intramural Research
Program of the NIH; P. Cormier, Association pour la
Recherche contre la Cancer (ARC), France, grants 4247
and 3507 to P.C., Ligue Nationale contre le Cancer to
P.C., Conseil Régional de Bretagne and Conseil Général
du Finistère; W. H. Klein, National Eye Institute, NIH
grant EY11930, NICHD HD66219, and the Robert A.
Welch Foundation (G-0010); N. Adams, NSF grant IBN
0417003 and the Department of the Navy, Office of
Naval Research, under Award N00014-05-1-0855;
D. Epel, NSF 0417225; A. Hamdoun, F32-HD47136;
C. Byrum, American Heart Association grant 0420074Z;
K. Walton, U.S. Army Medical Research and Materiel
Command grant W81XWH-04-1-0324; J. Stegeman, NIH
2P42 ESO7381; and J. Goldstone, NIH F32 ESO12794.
Sea Urchin Genome Sequencing Consortium
Overall project leadership: Erica Sodergren,1,2George M.
Weinstock,1,2Eric H. Davidson,3R. Andrew Cameron3
Principal investigators: Richard A. Gibbs,1,2George M.
Annotation section leaders: Robert C. Angerer,4Lynne M.
Angerer,4Maria Ina Arnone,5David R. Burgess,6Robert D. Burke,7
R. Andrew Cameron,3James A. Coffman,8Eric H. Davidson,3
R. Foltz,12Amro Hamdoun,13Richard O. Hynes,14William H.
Klein,15William Marzluff,16David R. McClay,17Robert L. Morris,18
Arcady Mushegian,19,20Jonathan P. Rast,21Erica Sodergren,1,2
L. Courtney Smith,23Michael C. Thorndyke,24Victor D. Vacquier,24
George M. Weinstock,1,2Gary M. Wessel,26Greg Wray,27Lan
Annotation: Gene list: Erica Sodergren1,2(leader), George M.
Weinstock1,2(leader), Robert C. Angerer,4Lynne M. Angerer,4
R. Andrew Cameron,3Eric H. Davidson,3Christine G. Elsik,27Olga
Melissa J. Landrum,28Aaron J. Mackey,32* Donna Maglott,28
Georgia Panopoulou,33Albert J. Poustka,33Kim Pruitt,28Victor
Sapojnikov,29Xingzhi Song,1,2Alexandre Souvorov,28Victor
Solovyev,34Zheng Wei,4Charles A. Whittaker,35Kim Worley,1,2
Assembly of genome: Erica Sodergren1,2(leader), George M.
Weinstock1,2(leader), K. James Durbin,1,2Richard A. Gibbs,1,2
Yufeng Shen1,2(v2.1), Xingzhi Song1,2(v0.5), Kim Worley,1,2Lan
Basal transcription apparatus proteins and polymerases
chromatin proteins: Greg Wray27(leader), Olivier Fedrigo,26
David Garfield,27Ralph Haygood,17Alexander Primus,26Rahul
BCM-HGSC annotation database and Genboree: Lan Zhang1,2
(leader), Erica Sodergren1,2(leader), George M. Weinstock1,2
(leader), Manuel L. Gonzalez-Garay,1,2Andrew R. Jackson,1,2
Aleksandar Milosavljevic,1,2Xingzhi Song,1,2Mark Tong,1,2Kim
Biomineralization: Charles A. Ettensohn11(leader), R. Andrew
Cameron,3Christopher E. Killian,36Melissa J. Landrum,31Brian T.
Livingston,37Fred H. Wilt36
Cell physiology: James A. Coffman8(leader), William Marzluff16
(leader), Arcady Mushegian19,20(leader), Nikki Adams,37Robert
Bertrand Cosson,38,39Jenifer Croce,17Antonio Fernandez-
Kelkar,42Julia Morales,38,39Odile Mulner-Lorillon,39,40Anthony J.
(leader), Nikki Adams,36Bryan Cole,13Michael Dean,9David
Scally,9John J. Stegeman43
Ciliogenesis and ciliary compounds: Robert L. Morris18(leader),
Erin L. Allgood,18Jonah Cool,18Kyle M. Judkins,18Shawn S.
McCafferty,18Ashlan M. Musante,18Robert A. Obar,44†Amanda P.
Rawson,18Blair J. Rossetti18
Cytoskeletal and organelle genes: David R. Burgess6(leader),
Erin L. Allgood,18Jonah Cool,18Ian R. Gibbons,45Matthew P.
Hoffman,6Kyle M. Judkins,18Andrew Leone,6Shawn S.
McCafferty,18Robert L. Morris,18Ashlan M. Musante,18Robert A.
Embryonic transcriptome: Eric H. Davidson3(leader),
R. Andrew Cameron,3Sorin Istrail,46Stefan C. Materna,3Manoj
P. Samanta,47,48Viktor Stolc,47Waraporn Tongprasit,47Qiang
Embryonic temporal expression pattern list: Robert C.
Angerer4(leader), Lynne M. Angerer4(leader), Zheng Wei4
Echinoderm adhesome: Richard O. Hynes14(leader), Karl-Frederik
Bergeron,49Bruce P. Brandhorst,50Robert D. Burke,7Charles A.
Echinoderm evolution: R. Andrew Cameron3(leader), Kevin
Berney,3David J. Bottjer,51Cristina Calestani,53Eric H. Davidson,3
Kevin Peterson,54Elly Chow,55Qiu Autumn Yuan55
Genome analysis [GC content]: Eran Elhaik,56Christine G.
Elsik,28Dan Graur,56Justin T. Reese28
Genome FPC map: Ian Bosdet,57Shin Heesun,57Marco A.
Human genetic disease orthologs: Michael Dean9(leader),
Amro Hamdoun13(leader), The Sea Urchin Genome Sequenc-
Immunity: Jonathan P. Rast21(leader), L. Courtney Smith23
(leader), Michele K. Anderson,22Kevin Berney,3Virginia
Brockton,23Katherine M. Buckley,23R. Andrew Cameron,3Avis
H. Cohen,58Sebastian D. Fugmann,59Taku Hibino,21Mariano
Loza-Coll,21Audrey J. Majeske,23Cynthia Messier,21Sham V.
Nair,60Zeev Pancer,61David P. Terwilliger22
Neurobiology and sensory systems: Robert D. Burke7(leader),
Maurice R. Elphick10(leader), William H. Klein15(leader),
Michael C. Thorndyke24(leader), Cavit Agca,62Lynne M. Angerer,4
Enrique Arboleda,5Maria Ina Arnone,5Bruce P. Brandhorst,50
Nansheng Chen,50Allison M. Churcher,63F. Hallböök,64Glen W.
Humphrey,65Richard O. Hynes,14Mohammed M. Idris,5Takae
Kiyama,15Shuguang Liang,15Dan Mellott,60Xiuqian Mu,15Greg
John S. Taylor,63Kristin Tessmar-Raible,66D. Wang,63Karen H.
Reproduction: Kathy R. Foltz12(leader), Victor D. Vacquier25
(leader), Gary M. Wessel26(leader), Terry Gaasterland,25Blanca E.
Galindo,67Herath J. Gunaratne,25Meredith Howard-Ashby,3Glen
W. Humphrey,65Celina Juliano,26Masashi Kinukawa,25Gary W.
Reade,12Michelle M. Roux,12Jia L. Song,25Yi-Hsien Su,3Ian K.
Townley,12Ekaterina Voronina,26Julian L. Wong26
Sea Urchin Genome Annotation Workshop in Naples: Maria
Ina Arnone5(leader), Michael C. Thorndyke24(leader), Gabriele
Amore,5Lynne M. Angerer,4Enrique Arboleda,5Margherita
Branno,5Euan R. Brown,5Vincenzo Cavalieri,69Véronique
Duboc,70Louise Duloquin,70Maurice R. Elphick,10Constantin
Flytzanis,70,71Christian Gache,70Anne-Marie Genevière,40,41
Mohammed M. Idris,5François Lapraz,70Thierry Lepage,70
Annamaria Locascio,5Pedro Martinez,73,74Giorgio Matassi,75
J. Poustka,33Florian Raible,66,67Ryan Range,70Francesca Rizzo,5
Eric Röttinger,70Matthew Rowe,10Kristin Tessmar-Raible,66Erica
Sodergren,1,2George M. Weinstock,1,2Karen Wilson24
Signal transduction: David R. McClay17(leader), Lynne M.
Christine Byrum,17,78Jenifer Croce,17Veronique Duboc,70Louise
Duloquin,70Christian Gache,70Anne-Marie Genevière,40,41
Tom Glenn,17Taku Hibino,22Sofia Hussain,37François Lapraz,70
Thierry Lepage,70Brian T. Livingston,37Mariano Loza,21Gerard
Röttinger,70Rebecca Thomason,17,78Katherine Walton,17Zheng
Wei,4Gary M. Wessel,26Athula Wikramanayke,77Karen H.
Wilson,23Charles Whittaker,35Shu-Yu Wu,17Ronghui Xu78
Transcription regulatory factors: Eric H. Davidson3(leader),
Cameron,3Lili Chen,3Rachel F. Gray,3Meredith Howard-Ashby,3
Sorin Istrail,46Pei Yun Lee,3Annamaria Locascio,5Pedro
Martinez,73,74Stefan C. Materna,3Jongmin Nam,3Paola Oliveri,3
Francesca Rizzo,5Joel Smith3
DNA sequencing: Donna Muzny1,2(leader), Erica Sodergren1,2
(leader), Richard A. Gibbs1,2(leader), George M. Weinstock1,2
(leader), Stephanie Bell,1,2Joseph Chacko,1,2Andrew Cree,1,2
Stacey Curry,1,2Clay Davis,1,2Huyen Dinh,1,2Shannon Dugan-
Hernandez,1,2Sandra Hines,1,2Jennifer Hume,1,2LaRonda
Jackson,1,2Angela Jolivet,1,2Christie Kovar,1,2Sandra Lee,1,2Lora
Lewis,1,2George Miner,1,2Margaret Morgan,1,2Lynne V.
Nazareth,1,2Geoffrey Okwuonu,1,2David Parker,1,2Ling-Ling
Pu,1,2Yufeng Shen,1,2Rachel Thorn,1,2Rita Wright1,2
1Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
2Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
3Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA.
4National Institute of Dental and Craniofacial Research, NIH, Bethesda, MD
VOL 314 10 NOVEMBER 2006
5Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy.
6Department of Biology, Boston College, Chestnut Hill, MA 02467, USA.
7Department of Biology, Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, Canada,
8Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA.
9Human Genetics Section, Laboratory of Genomic Diversity, National Cancer Institute–Frederick, Frederick, MD 21702, USA.
10School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, UK.
11Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.
12Department Molecular, Cellular and Developmental Biology and the Marine Science Institute, University of California, Santa Barbara, Santa Barbara, CA 93106–9610, USA.
13Hopkins Marine Station, Stanford University, Pacific Grove, CA 93950, USA.
14Howard Hughes Medical Institute, Center for Cancer Research, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA.
15Departments of Biochemistry and Molecular Biology, University of Texas, M. D. Anderson Cancer Center, Houston, TX, 77030, USA.
16Molecular Biology and Biotechnology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
17Department of Biology, Duke University, Durham, NC 27708, USA.
18Department of Biology, Wheaton College, Norton, MA 02766, USA.
19Stowers Institute for Medical Research, Kansas City, MO 64110, USA.
20Department of Microbiology, Kansas University Medical Center, Kansas City, KS 66160, USA.
21Sunnybrook Research Institute and Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada M4N 3M5.
Immunology, University of Toronto, Toronto, Ontario, Canada, M4N 3M5.
23Department of Biological Sciences, George Washington University, Washington, DC 20052,
24Royal Swedish Academy of Sciences, Kristineberg Marine Research Station, Fiskebackskil, 450 34, Sweden.
25Marine Biology, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA 92093–0202, USA.
26Department of Molecular and Cellular Biology and Biochemistry, Brown University Providence, RI 02912,
27Department of Biology and Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708,
28Department of Animal Science, Texas A&M University, College Station, TX 77843, USA.
Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA.
30Department of Ecology, Evolution, and Marine Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA.
31National Center for Biotechnology Information, NIH, Bethesda, MD 20892, USA.
32Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104, USA.
33Evolution and Development Group, Max-Planck Institut für Molekulare Genetik, 14195 Berlin, Germany.
34Royal Holloway, University of London, Egham, Surrey TW20 0EX,
35Center for Cancer Research, MIT, Cambridge, MA 02139, USA.
36Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720–3200, USA.
37Department of Biology, University of South Florida, Tampa, FL 33618, USA.
38Université Pierre et Marie Curie (Paris 6), UMR 7150, Equipe Cycle Cellulaire et Développement, Station Biologique de Roscoff, 29682 Roscoff Cedex, France.
39CNRS, UMR 7150, Station Biologique de Roscoff, 29682 Roscoff Cedex, France.
40CNRS, UMR7628, Banyuls-sur-Mer, F-66650, France.
41Université Pierre et Marie Curie (Paris 6), UMR7628, Banyuls-sur-Mer, F-66650, France.
42Center for Bioinformatics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
43Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543, USA.
44Tethys Research, LLC, 2115 Union Street, Bangor, Maine
45Department of Molecular, Cellular, and Developmental Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
46Center for Computational Molecular Biology, and Computer Science Department, Brown University, Providence, RI 02912, USA.
47Genome Research Facility, National Aeronautics and Space Administration, Ames Research Center, Moffet Field, CA 94035,
48Systemix Institute, Cupertino, CA 95014, USA.
49Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada, V5A 1S6.
50Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6.
51Department of Biology, Center for Cancer Research, MIT, Cambridge, MA 02139, USA.
52Department of Earth Sciences, University of Southern California, Los Angeles, CA 90089–0740, USA.
53Department of Biology, University of Central Florida, Orlando, FL
54Department of Biological Sciences, Dartmouth College, Hanover, NH 03755, USA.
55Center for Computational Regulatory Genomics, Beckman Institute, California Institute of Technology, Pasadena, CA 91125, USA.
56Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA.
57Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada, V5Z 4E6.
58Department of Biology and the Institute of Systems Research, University of Maryland, College Park, MD 20742, USA.
59Laboratory of Cellular and Molecular Biology, National Institute on Aging, NIH, Baltimore, MD
University, Sydney NSW 2109, Australia.
61Center of Marine Biotechnology, UMBI, Columbus Center, Baltimore, MD 21202, USA.
62Department of Cell Biology and Anatomy, Louisiana State University Health Sciences Center, New Orleans, LA 70112, USA.
63Department of Biology, University of Victoria, Victoria, BC, Canada, V8W 2Y2.
64Department of Neuroscience, Uppsala University, Uppsala, Sweden.
65Laboratory of Cellular and Molecular Biophysics, National Institute of Child Health and Development, NIH, Bethesda, MD 20895, USA.
EMBL, 69117 Heidelberg, Germany.
67Computational Unit, EMBL, 69117 Heidelberg, Germany.
68Biotechnology Institute, Universidad Nacional Autónoma de Mexico (UNAM),
and Developmental Biology “Alberto Monroy,” University of Palermo, 90146 Palermo, Italy.
70Laboratoire de Biologie du Développement (UMR 7009), CNRS and Université Pierre et Marie Curie (Paris 6), Observatoire Océanologique, 06230 Villefranche-sur-Mer, France.
71Department of Biology, University of Patras, Patras, Greece.
72Department of Molecular and Cellular Biology, Baylor College of Medicine,
Genetica, Universitat de Barcelona, 08028–Barcelona, Spain.
74Institució Catalana de Recerca i Estudis Avancats (ICREA), Barcelona, Spain.
75Institut Jacques Monod, CNR-UMR 7592, 75005 Paris, France.
76Consiglio Nazionale delle Ricerche, Istituto di Biomedicina e Immunologia Molecolare “Alberto Monroy,” 90146 Palermo, Italy.
77Razavi-Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, CA 92186, USA.
78Department of Zoology, University of Hawaii at Manoa, Honolulu, HI 96822, USA.
*Present address: GlaxoSmithKline, 1250 South College-
ville Road, Collegeville, PA 19426, USA.
†Present address: Massachusetts General Hospital Cancer
Center, Charlestown, MA 02129, USA.
Supporting Online Material
Materials and Methods
Figs. S1 to S6
Tables S1 to S8
8 August 2006; accepted 17 October 2006
Genomic Insights into the Immune
System of the Sea Urchin
Jonathan P. Rast,1* L. Courtney Smith,2Mariano Loza-Coll,1Taku Hibino,1Gary W. Litman3,4
Comparative analysis of the sea urchin genome has broad implications for the primitive state of
deuterostome host defense and the genetic underpinnings of immunity in vertebrates. The sea
urchin has an unprecedented complexity of innate immune recognition receptors relative to other
animal species yet characterized. These receptor genes include a vast repertoire of 222 Toll-like
receptors, a superfamily of more than 200 NACHT domain–leucine-rich repeat proteins (similar to
nucleotide-binding and oligomerization domain (NOD) and NALP proteins of vertebrates), and a
large family of scavenger receptor cysteine-rich proteins. More typical numbers of genes encode
other immune recognition factors. Homologs of important immune and hematopoietic regulators,
many of which have previously been identified only from chordates, as well as genes that are
critical in adaptive immunity of jawed vertebrates, also are present. The findings serve to
underscore the dynamic utilization of receptors and the complexity of immune recognition that
may be basal for deuterostomes and predicts features of the ancestral bilaterian form.
as acquired (adaptive), in which immune
recognition specificity is the product of
somatic diversification and selective clonal pro-
ificity is germline encoded. Collectively, these
systems act to protect the individual from
invasive bacteria, viruses, and eukaryotic pathogens
by detecting molecular signatures of
infection and initiating effector responses. Innate
immune mechanisms probably originated early
in animal phylogeny and are closely allied with
In many cases, their constituent elements are
distributed throughout the cells of the organism.
In bilaterally symmetrical animals (Bilateria),
immune defense is carried out and tightly
coordinated by a specialized set of mesoderm-
of developmental and immune programs are a
variety of rapidly evolving recognition and
ERRATUM POST DATE 9 FEBRUARY 2007
Research Articles: “The genome of the sea urchin Strongylocentrotus purpuratus” by Sea
Urchin Genome Sequencing Consortium (10 Nov. 2006, p. 941). On pages 951 and 952,
errors were made in renumbering authors’ affiliations: Some changes were missed, and the
affiliation for Nikki Adams was omitted. C. G. Elsik, T. Hibino, and V. D. Vacquier appear twice.
C. G. Elsik is at Texas A&M University; T. Hibino is at the Sunnybrook Research Institute and
Department of Medical Biophysics, University of Toronto; V. D. Vacquier is at the Scripps
Institution of Oceanography. Corrected group affiliations, then individuals alphabetically: P. Kitts, M.
J. Landrum, D. Maglott, K. Pruitt, A. Souvorov, National Center for Biotechnology Information,
National Library of Medicine, Bethesda, MD 20894, USA. O. Fedrigo, A. Primus, R. Satija,
Department of Biology and Institute for Genome Sciences and Policy, Duke University, Durham,
NC 27708, USA. Nikki Adams, Biology Department, California Polytechnic State University, San
Luis Obispo, CA 93407, USA. C. Flytzanis, Department of Biology, University of Patras, Patras,
Greece, and the Department of Molecular and Cellular Biology, Baylor College of Medicine,
One Baylor Plaza, Houston, TX 77030, USA. B. E. Galindo, Biotechnology Institute, Universidad
Nacional Autónoma de Mexico (UNAM), Cuernavaca, Morelos, Mexico 62250. J. V. Goldstone,
Department of Molecular, Cellular, and Developmental Biology, University of California,
Berkeley, Berkeley, CA 94720, USA. G. Manning, Razavi-Newman Center for Bioinformatics, Salk
Institute for Biological Studies, La Jolla, CA 92186, USA. D. Mellott, Center of Marine
Biotechnology, University of Maryland Biotechnology Institute, Columbus Center, Baltimore, MD
21202, USA. J. Song, Department of Molecular and Cellular Biology and Biochemistry, Brown
University, Providence, RI 02912, USA. D. P. Terwilliger, Department of Biological Sciences,
George Washington University, Washington, DC 20052, USA. A. Wikramanayake, Department
of Zoology, University of Hawaii at Manoa, Honolulu, HI 96822, USA.