Shedding genomic light on Aristotle's lantern

Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Alkek N1519, Houston, TX 77030, USA.
Developmental Biology (Impact Factor: 3.55). 01/2007; 300(1):2-8. DOI: 10.1016/j.ydbio.2006.10.005
Source: PubMed


Sea urchins have fascinated biologists since the time of Aristotle, who compared the appearance of their bony mouth structure to a lantern in The History of Animals. Throughout modern times they have served as a model system for research in developmental biology. Now, the genome of the sea urchin Strongylocentrotus purpuratus is the first echinoderm genome to be sequenced. A high-quality draft sequence assembly was produced using the Atlas assembler to combine whole genome shotgun sequences with sequences from a collection of BACs selected to form a minimal tiling path along the genome. A formidable challenge was presented by the high degree of heterozygosity between the two haplotypes of the selected male representative of this marine organism. This was overcome by use of the BAC tiling path backbone, in which each BAC represents a single haplotype, as well as by improvements in the Atlas software. Another innovation introduced in this project was the sequencing of pools of tiling path BACs rather than individual BACs. The Clone-Array Pooled Shotgun Strategy greatly reduced the cost and time devoted to preparing shotgun libraries from BAC clones. The genome sequence was analyzed with several gene prediction methods to produce a comprehensive gene list that was then manually refined and annotated by a volunteer team of sea urchin experts. This annotation community edited over 9000 gene models and uncovered many unexpected aspects of the sea urchin genetic content that bear on transcriptional regulation, immunology, sensory perception, and development. Analysis of the basic deuterostome genetic complement supports the sea urchin's role as a model system for deuterostome and, by extension, chordate development.
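The pooling idea mentioned in the abstract can be illustrated with a small sketch. The abstract itself only states that pools of tiling path BACs were sequenced instead of individual BACs; the row-and-column array layout, the function names, and the toy data below are assumptions made for illustration and are not taken from the project's actual pipeline. Under that assumed layout, a contig whose reads appear in exactly one row pool and one column pool can be traced back to the single BAC at their intersection, and the number of shotgun libraries drops from rows x columns to rows + columns.

```python
def build_pools(grid):
    """Return (row_pools, col_pools) for a rectangular array of BAC names.

    Each pool is the set of BACs that would be combined into one library.
    The array layout is an illustrative assumption.
    """
    row_pools = [set(row) for row in grid]
    col_pools = [set(col) for col in zip(*grid)]
    return row_pools, col_pools


def assign_contigs(row_hits, col_hits, grid):
    """Assign contigs to BACs by intersecting row and column pool membership.

    row_hits / col_hits map a contig name to the set of pool indices in
    which reads from that contig were observed. Contigs seen in anything
    other than exactly one row pool and one column pool are treated as
    ambiguous and left unassigned.
    """
    assignment = {}
    for contig in row_hits.keys() & col_hits.keys():
        rows, cols = row_hits[contig], col_hits[contig]
        if len(rows) == 1 and len(cols) == 1:
            (r,) = rows
            (c,) = cols
            assignment[contig] = grid[r][c]
    return assignment


def libraries_needed(n_rows, n_cols):
    """Return (libraries for individual BACs, libraries for pooled design)."""
    return n_rows * n_cols, n_rows + n_cols


if __name__ == "__main__":
    # Hypothetical 3 x 3 array of BAC clone names.
    grid = [["B1", "B2", "B3"],
            ["B4", "B5", "B6"],
            ["B7", "B8", "B9"]]
    row_hits = {"ctgA": {0}, "ctgB": {1, 2}}
    col_hits = {"ctgA": {2}, "ctgB": {0}}
    print(assign_contigs(row_hits, col_hits, grid))
    print(libraries_needed(3, 3))
```

In this toy case the contig seen in two row pools is left unassigned, which is one simple way to handle sequence shared between overlapping BACs; a real deconvolution step would need a more careful treatment.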

Available from: George Weinstock
  • Source
    • "For this and the many other areas of contemporary molecular and cell biology in which S. purpuratus plays a prominent role, progress is directly affected by the accuracy of the annotated gene models available in the current genome builds. The initial set of gene models was obtained by merging four different sequence-based approaches to computational gene prediction: Ensembl pipeline, NCBI gnomon, FgenesH, and Genscan, using the GLEAN algorithm (Sodergren et al. 2006). The GLEAN result was evaluated using a set of ~600 cDNA/ESTs that were not used in any of the gene prediction programs, and was considered better than any single algorithm."
    ABSTRACT: A comprehensive transcriptome analysis has been performed on protein-coding RNAs of Strongylocentrotus purpuratus, including 10 different embryonic stages, six feeding larval and metamorphosed juvenile stages, and six adult tissues. In this study, we pooled the transcriptomes from all of these sources and focused on the insights they provide for gene structure in the genome of this recently sequenced model system. The genome had initially been annotated by use of computational gene model prediction algorithms. A large fraction of these predicted genes were recovered in the transcriptome when the reads were mapped to the genome and appropriately filtered and analyzed. However, in a manually curated subset, we discovered that more than half the computational gene model predictions were imperfect, containing errors such as missing exons, prediction of nonexistent exons, erroneous intron/exon boundaries, fusion of adjacent genes, and prediction of multiple genes from single genes. The transcriptome data have been used to provide a systematic upgrade of the gene model predictions throughout the genome, very greatly improving the research usability of the genomic sequence. We have constructed new public databases that incorporate information from the transcriptome analyses. The transcript-based gene model data were used to define average structural parameters for S. purpuratus protein-coding genes. In addition, we constructed a custom sea urchin gene ontology, and assigned about 7000 different annotated transcripts to 24 functional classes. Strong correlations became evident between given functional ontology classes and structural properties, including gene size, exon number, and exon and intron size.
    Genome Research 06/2012; 22(10):2079-87. DOI:10.1101/gr.139170.112 · 14.63 Impact Factor
  • Source
    • "[...] echinoderm phylum, are easily obtained, housed, handled, spawned and otherwise manipulated, and are frequently used in investigations of early development (Davidson, 2006; Sodergren et al., 2006a,b). The echinoderms, as the sister phylum to the chordates, are an important basal group for making evolutionary inferences about the immune system in deuterostomes (reviewed in Rast and Messier-Solek, 2008)."
    ABSTRACT: A full length cDNA sequence expressed in coelomocytes shows significant sequence match to vertebrate Tie1 and Tie2/TEK. Vertebrate Tie2/TEK is the receptor for the angiopoietins and plays an important role in angiogenesis and hematopoiesis, whereas Tie1 regulates the activity of Tie2. The deduced sequence of the SpTie1/2 protein has a similar order and organization of domains to the homologous vertebrate proteins including a highly conserved receptor tyrosine kinase domain in the cytoplasmic tail. The N terminus of the ectodomain has one immunoglobulin (Ig)-Tie2_1 domain, followed by an Ig domain, four epidermal growth factor domains, a second Ig domain, and three fibronectin type III domains. The SpTie1/2 gene is expressed in coelomocytes and the axial organ, whereas other organs do not show significant expression. The timing of embryonic expression corresponds with the differentiation of blastocoelar cells, the embryonic and larval immune cells. Searches of the sea urchin genome show several gene models encoding putative ligands and signaling proteins that might interact with SpTie1/2. We speculate that SpTie1/2 may be involved in the proliferation of sea urchin immune cells in both adults and embryos.
    Developmental and comparative immunology 08/2010; 34(8):884-895. DOI:10.1016/j.dci.2010.03.010 · 2.82 Impact Factor
  • Source
    • "The sea urchin genome sequence presents some unique problems for the bio-informatician. The high degree of polymorphism in the genome required special considerations in the assembly process (3). The draft sequence is a mosaic of two haplotypes estimated from the assembly to differ by about 2%. "
    ABSTRACT: SpBase is a system of databases focused on the genomic information from sea urchins and related echinoderms. It is exposed to the public through a web site served with open source software. The enterprise was undertaken to provide an easily used collection of information to directly support experimental work on these useful research models in cell and developmental biology. The information served from the databases emerges from the draft genomic sequence of the purple sea urchin, Strongylocentrotus purpuratus, and includes sequence data and genomic resource descriptions for other members of the echinoderm clade, which in total span 540 million years of evolutionary time. This version of the system contains two assemblies of the purple sea urchin genome, associated expressed sequences, gene annotations and accessory resources. Search mechanisms for the sequences and the gene annotations are provided. Because the system is maintained along with the Sea Urchin Genome resource, a database of sequenced clones is also provided.
    Nucleic Acids Research 12/2008; 37(Database issue):D750-4. DOI:10.1093/nar/gkn887 · 9.11 Impact Factor