-
Juan I Montoya-Burgos
ABSTRACT: Recent genome-wide analyses have revealed patterns of positive selection acting on protein-coding genes in humans and mammals. To assess whether the conclusions drawn from these analyses are valid for other vertebrates and to identify mammalian specificities, I have investigated the selective pressure acting on protein-coding genes of the puffer fishes Tetraodon and Takifugu. My results indicate that the strength of purifying selection in puffer fishes is similar to previous reports for murids but stronger than in hominids, which have a smaller population size. Gene ontology analyses show that more than half of the biological processes targeted by positive selection in mammals are also targeted in puffer fishes, highlighting general patterns for vertebrates. Biological processes enriched with positively selected genes that are shared between mammals and fishes include immune and defense responses, signal transduction, regulation of transcription and several of their descendant terms. Mammalian-specific processes displaying an excess of positively selected genes are related to sensory perception and neurological processes. The comparative analyses also revealed that, for both mammals and fishes, genes encoding extracellular proteins are preferentially targeted by positive selection, indicating that adaptive evolution occurs more often in the extracellular environment than inside the cell. Moreover, I present here the first genome-wide characterization of neutrally-evolving regions of protein-coding genes. This analysis revealed an unexpectedly high proportion of genes containing both positively selected motifs and neutrally-evolving regions, uncovering a strong link between neutral evolution and positive selection. I speculate that neutrally-evolving regions are a major source of novelties screened by natural selection.
PLoS ONE 01/2011; 6(9):e24800. · 4.09 Impact Factor
-
ABSTRACT: Hox genes are central to the specification of structures along the anterior-posterior body axis, and modifications in their expression have paralleled the emergence of diversity in vertebrate body plans. Here we describe the genomic organization of Hox clusters in different reptiles and show that squamates have accumulated unusually large numbers of transposable elements at these loci, reflecting extensive genomic rearrangements of coding and non-coding regulatory regions. Comparative expression analyses between two species showing different axial skeletons, the corn snake and the whiptail lizard, revealed major alterations in Hox13 and Hox10 expression features during snake somitogenesis, in line with the expansion of both caudal and thoracic regions. Variations in both protein sequences and regulatory modalities of posterior Hox genes suggest how this genetic system has dealt with its intrinsic collinear constraint to accompany the substantial morphological radiation observed in this group.
Nature 03/2010; 464(7285):99-103. · 36.28 Impact Factor
-
Molecular Phylogenetics and Evolution 11/2009; · 3.61 Impact Factor
-
ABSTRACT: Hox genes control many aspects of embryonic development in metazoans. Previous analyses of this gene family revealed a surprising diversity in terms of gene number and organization between various animal species. In vertebrates, Hox genes are grouped into tightly organized clusters, claimed to be devoid of repetitive sequences. Here, we report the genomic organization of the four Hox loci present in the green anole lizard and show that they have massively accumulated retrotransposons, leading to gene clusters larger in size when compared to other vertebrates. In addition, similar repeats are present in many other development-related gene-containing regions, also thought to be refractory to such repetitive elements. Transposable elements are major sources of genetic variations, including alterations of gene expression, and hence this situation, so far unique among vertebrates, may have been associated with the evolution of the spectacular realm of morphological variations in the body plans of Squamata. Finally, sequence alignments highlight some divergent evolution in highly conserved DNA regions between vertebrate Hox clusters, which may coincide with the emergence of mammalian-specific features.
Genome Research 03/2009; 19(4):602-10. · 13.61 Impact Factor
-
ABSTRACT: Neotropical freshwater fishes have reached an unrivalled diversity, organized into several areas of endemism, yet the underlying processes are still largely unknown. The topographical and ecological characteristics of the Guyanas Region make it an ideal area of endemism in which to investigate the forces that have shaped this great diversity. This region is thought to be inhabited by species descending from Amazonian ancestors, which would have used two documented routes that, however, hardly explain the entrance of species adapted to running waters. Here, we investigate the evolutionary history of Pseudancistrus brevispinis, a catfish endemic to this region and exclusively found in running waters, thus making it an ideal model for investigating colonization routes and dispersal in such habitats. Our analyses, based on mitochondrial and nuclear markers, revealed an unexpected diversity consisting of six monophyletic lineages within P. brevispinis, showing a disjoint distribution pattern. The lineages endemic to Guyanas coastal rivers form a monophyletic group that originated via an ancestral colonization event from the Amazon basin. Evidence given favours a colonization pathway through river capture between an Amazonian tributary and the Upper Maroni River. Population genetic analyses of the most widespread species indicate that subsequent dispersal among Guyanas coastal rivers occurred principally by temporary connections between adjacent rivers during periods of lower sea level, yet instances of dispersal via interbasin river captures are not excluded. During high sea level intervals, the isolated populations would have diverged leading to the observed allopatric species. This evolutionary process is named the sea level fluctuation (SLF) hypothesis of diversification.
Molecular Ecology 02/2009; 18(5):947-64. · 5.52 Impact Factor
-
ABSTRACT: The Neotropics possess the greatest freshwater fish diversity of the world, rendering the study of their evolutionary history extremely challenging. Loricariidae catfishes are one of the most diverse components of the Neotropical ichthyofauna and despite a long history of classification, major issues still need elucidation. Based on a nuclear gene, we present a robust phylogeny of two former loricariid subfamilies: Hypoptopomatinae and Neoplecostominae. Our results show that Neoplecostominae is nested within Hypoptopomatinae, and is the sister group to the former Otothyrini tribe. According to our results, supplemented by morphological observations, we erect two new subfamilies, the Otothyrinae and a new Hypoptopomatinae, and modify the Neoplecostominae by including the genus Pseudotocinclus. The uncovered evolutionary relationships allow a detailed analysis of their historical biogeography. We tested two Dispersal-Extinction-Cladogenesis models for inferring the distribution range evolution of the new subfamilies, and show that the model having no constraints performs better than a model constraining long-range dispersal. The Maximum Likelihood reconstructions of ancestral ranges showed a marked division between the Amazonian origin of the Hypoptopomatinae and the eastern coastal Brazil+Upper Paraná origin of the Neoplecostominae and Otothyrinae. Markedly few instances of dispersal across the border separating the Amazon basin and the Paraná-Paraguay+eastern coastal Brazil+Uruguay were reconstructed. This result is in clear contrast with the historical biogeography of many Neotropical fishes, including other Loricariidae. Part of the dispersal limitation may be explained by divergent ecological specialization: lowland river versus mountain stream habitats. Moreover, because most species of the new subfamilies are small, we hypothesize that body size-related effects, such as predation and the energetic cost of migration, might limit their dispersal. Finally, morphological and anatomical features are presented that limit or, to the contrary, enhance dispersal capability in these small and fascinating catfishes.
Molecular Phylogenetics and Evolution 09/2008; 49(2):606-17. · 3.61 Impact Factor
-
ABSTRACT: With the increase of laboratory facilities, molecular phylogenies are playing a predominant role in evolutionary analyses. However, understanding the evolution of morphological traits remains essential for a comprehensive view of the evolution of a group. Here we present a new approach based on co-inertia analysis for identifying characters whose variation depends on the phylogeny, a prerequisite for analyzing the evolution of characters. Our approach has the advantage of treating the full data set at once, including qualitative and quantitative variables. It provides a graphical output giving the contribution of each variable to the co-structure, allowing a direct discrimination between phylogenetically dependent and independent variables. We have implemented this approach in deciphering the evolution of morphological traits in a highly specialized group of Neotropical catfishes: the Loricariinae. We have first inferred a molecular phylogeny of this group based on the 12S and 16S mitochondrial genes. The resulting phylogeny indicated that the subtribe Harttiini was restricted to the single genus Harttia, and within the subtribe Loricariini, two sister subtribes were distinguished, Sturisomina (new subtribe), and Loricariina. Among Loricariina, the morphological groups Loricariichthys and Loricaria+Pseudohemiodon were confirmed. The co-inertia analysis highlighted a strong relationship between the morphological and the genetic data sets, and identified three quantitative and eight qualitative variables linked to the phylogeny. The evolution of quantitative variables was assessed using the orthogram method and showed a major punctual event in the evolution of the number of caudal-fin rays, and a more gradual pattern of evolution of the number of teeth along the phylogeny. The evolution of qualitative variables was inferred using ancestral state reconstructions and highlighted parallel patterns of evolution in characters linked to the mouth, suggesting co-evolution of the traits for adapting to divergent substrates.
Molecular Phylogenetics and Evolution 04/2008; 46(3):986-1002. · 3.61 Impact Factor
-
ABSTRACT: A comprehensive phylogenetic framework is indispensable for investigating the evolution of genomic features in mammals as a whole, and particularly in humans. Using the ENCODE sequence data, we estimated mammalian neutral evolutionary rates and selective pressures acting on conserved coding and noncoding elements. We show that neutral evolutionary rates can be explained by the generation time (GT) hypothesis. Accordingly, primates (especially humans), having longer GTs than other mammals, display slower rates of neutral evolution. The evolution of constrained elements, particularly of nonsynonymous sites, is in agreement with the expectations of the nearly neutral theory of molecular evolution. We show that rates of nonsynonymous substitutions (dN) depend on the population size of a species. The results are robust to the exclusion of hypermutable CpG prone sites. The average rate of evolution in conserved noncoding sequences (CNCs) is 1.7 times higher than in nonsynonymous sites. Despite this, CNCs evolve at similar or even lower rates than nonsynonymous sites in the majority of basal branches of the eutherian tree. This observation could be the result of an overall gradual or, alternatively, lineage-specific relaxation of CNCs. The latter hypothesis was supported by the finding that 3 of the 20 longest CNCs displayed significant relaxation of individual branches. This observation may explain why the evolution of CNCs fits the expectations of the nearly neutral theory less well than the evolution of nonsynonymous sites.
Proceedings of the National Academy of Sciences 01/2008; 104(51):20443-8. · 9.68 Impact Factor
-
Elliott H Margulies,
Gregory M Cooper,
George Asimenos,
Daryl J Thomas,
Colin N Dewey,
Adam Siepel,
Ewan Birney,
Damian Keefe,
Ariel S Schwartz,
Minmei Hou, [......],
Marco A Marra,
Stylianos E Antonarakis,
Serafim Batzoglou,
Nick Goldman,
Ross Hardison,
David Haussler,
Webb Miller,
Lior Pachter,
Eric D Green,
Arend Sidow
ABSTRACT: A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.
Genome Research 07/2007; 17(6):760-74. · 13.61 Impact Factor
-
Ewan Birney,
John A Stamatoyannopoulos,
Anindya Dutta,
Roderic Guigó,
Thomas R Gingeras,
Elliott H Margulies,
Zhiping Weng,
Michael Snyder,
Emmanouil T Dermitzakis,
Robert E Thurman, [......],
David B Jaffe,
Jean L Chang,
Kerstin Lindblad-Toh,
Eric S Lander,
Maxim Koriabine,
Mikhail Nefedov,
Kazutoyo Osoegawa,
Yuko Yoshinaga,
Baoli Zhu,
Pieter J de Jong
ABSTRACT: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Nature 07/2007; 447(7146):799-816. · 36.28 Impact Factor
-
ABSTRACT: The survival of vertebrate species is dependent on the ability of individuals to adequately interact with each other, a function often mediated by the olfactory system. Diverse olfactory receptor repertoires are used by this system to recognize chemicals. Among these receptors, the V1rs, encoded by a very large gene family in most mammals, are able to detect pheromones. Teleosts, which also express V1r receptors, possess a very limited V1r repertoire. Here, taking advantage of the possibility to unequivocally identify V1r orthologs in teleosts, we analyzed the olfactory expression and evolutionary constraints of a pair of clustered fish V1r receptor genes, V1r1 and V1r2. Orthologs of the two genes were found in zebrafish, medaka, and threespine stickleback, but a single representative was observed in Tetraodontidae species. Analysis of V1r1 and V1r2 sequences from 12 different euteleost species indicates different evolutionary rates between the two paralogous genes, leading to a highly conserved V1r2 gene and a V1r1 gene under more relaxed selective constraint. Moreover, positively-selected sites were detected in specific branches of the V1r1 clade. Our results suggest a conserved agonist specificity of the V1R2 receptor between euteleost species, its loss in the Tetraodontidae lineage, and the acquisition of different chemosensory characteristics for the V1R1 receptor.
PLoS ONE 02/2007; 2(4):e379. · 4.09 Impact Factor
-
ABSTRACT: Understanding the early evolution of placental mammals is one of the most challenging issues in mammalian phylogeny. Here, we addressed this question by using the sequence data of the ENCODE consortium, which include 1% of mammalian genomes in 18 species belonging to all main mammalian lineages. Phylogenetic reconstructions based on an unprecedented amount of coding sequences taken from 218 genes resulted in a highly supported tree placing the root of Placentalia between Afrotheria and Exafroplacentalia (Afrotheria hypothesis). This topology was validated by the phylogenetic analysis of a new class of genomic phylogenetic markers, the conserved noncoding sequences. Applying the tests of alternative topologies on the coding sequence dataset resulted in the rejection of the Atlantogenata hypothesis (Xenarthra grouping with Afrotheria), while this test rejected the second alternative scenario, the Epitheria hypothesis (Xenarthra at the base), when using the noncoding sequence dataset. Thus, the two datasets support the Afrotheria hypothesis; however, none can reject both of the remaining topological alternatives.
PLoS Genetics 02/2007; 3(1):e2. · 8.69 Impact Factor
-
ABSTRACT: Symbiotic dinoflagellates belonging to the genus Symbiodinium are found in association with a wide variety of shallow-water invertebrates and protists dwelling in tropical and subtropical coral-reef ecosystems. Molecular phylogeny of Symbiodinium, initially inferred using nuclear ribosomal genes, was recently confirmed by studies of chloroplastic and mitochondrial genes, but with limited taxon sampling and low resolution. Here, we present the first complete view of Symbiodinium phylogeny based on concatenated partial sequences of chloroplast 23S-rDNA (cp23S) and nuclear 28S-rDNA (nr28S) genes, including all known Symbiodinium lineages. Our data produced a well-resolved phylogenetic tree and provide strong statistical support for the eight distinctive clades (A-H) that form the major taxa of Symbiodinium. The relative-rate tests did not show particularly high differences between lineages and both analysed markers. However, maximum likelihood ratio tests rejected a global molecular clock. Therefore, we applied a relaxed molecular clock method to infer the divergence times of all extant lineages of Symbiodinium, calibrating its phylogenetic tree with the fossil record of soritid foraminifera. Our analysis suggests that Symbiodinium originated in the early Eocene, and that the majority of extant lineages diversified since the mid-Miocene, about 15 million years ago.
Molecular Phylogenetics and Evolution 02/2006; 38(1):20-30. · 3.61 Impact Factor