Identifying concerted evolution and gene conversion in mammalian gene pairs lasting over 100 million years

The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada.
BMC Evolutionary Biology (Impact Factor: 3.41). 02/2009; 9:156. DOI: 10.1186/1471-2148-9-156
Source: PubMed

ABSTRACT Concerted evolution occurs in multigene families and is characterized by stretches of homogeneity and higher sequence similarity between paralogues than between orthologues. Here we identify human gene pairs that have undergone concerted evolution, caused by ongoing gene conversion, since at least the human-mouse divergence. Our strategy involved the identification of duplicated genes with greater similarity within a species than between species. These genes were required to be present in multiple mammalian genomes, suggesting duplication early in mammalian divergence. To eliminate genes that have been conserved due to strong purifying selection, our analysis also required at least one intron to have retained high sequence similarity between paralogues.
We identified three human gene pairs undergoing concerted evolution (BMP8A/B, DDX19A/B, and TUBG1/2). Phylogenetic investigations reveal that in each case the duplication appears to have occurred prior to eutherian mammalian radiation, with exactly two paralogues present in all examined species. This indicates that all three gene duplication events were established over 100 million years ago.
The extended duration of concerted evolution in multiple distant lineages suggests that there has been prolonged homogenization of specific segments within these gene pairs. Although we speculate that selection for homogenization may have acted to maintain crucial homo- or hetero-binding domains, it remains unclear why gene conversion has persisted for such extended periods of time. Through these analyses, our results demonstrate additional examples of a process that plays a definite, although unspecified, role in molecular evolution.
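
The screening strategy described in the abstract (paralogues more similar to each other than to their orthologues, plus at least one highly similar intron) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the input layout (pre-aligned sequence pairs), and the 90% intron threshold are all assumptions made for the example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def is_candidate(paralogue_pair, orthologue_pairs, intron_pairs,
                 intron_threshold=90.0):
    """Hypothetical filter for a concerted-evolution candidate gene pair.

    paralogue_pair   -- (seq_a, seq_b): aligned coding sequences of the two
                        human paralogues
    orthologue_pairs -- list of (human_seq, mouse_seq) aligned pairs
    intron_pairs     -- list of (intron_a, intron_b) aligned paralogous introns
    intron_threshold -- assumed cutoff in percent; not taken from the paper
    """
    within = percent_identity(*paralogue_pair)
    between = max(percent_identity(h, m) for h, m in orthologue_pairs)
    # Criterion 1: similarity within the species exceeds similarity between species.
    if within <= between:
        return False
    # Criterion 2: at least one intron remains highly similar between paralogues,
    # which argues against purifying selection on coding sequence alone.
    return any(percent_identity(i, j) >= intron_threshold for i, j in intron_pairs)
```

In practice the aligned inputs would come from a proper alignment tool, and the requirement that both paralogues be present in multiple mammalian genomes would be checked separately.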

Available from: Andrew R Carson, Jun 26, 2015
  • Source
    ABSTRACT: γ-Tubulin is the key protein for microtubule nucleation. Duplication of the γ-tubulin gene occurred several times during evolution, and in mammals γ-tubulin genes encode proteins which share ∼97% sequence identity. Previous analysis of Tubg1 and Tubg2 knock-out mice has suggested that γ-tubulins are not functionally equivalent. Tubg1 knock-out mice died at the blastocyst stage, whereas Tubg2 knock-out mice developed normally and were fertile. It was proposed that γ-tubulin 1 represents ubiquitous γ-tubulin, while γ-tubulin 2 may have some specific functions and cannot substitute for γ-tubulin 1 deficiency in blastocysts. The molecular basis of the suggested functional difference between γ-tubulins remains unknown. Here we show that exogenous γ-tubulin 2 is targeted to centrosomes and interacts with γ-tubulin complex proteins 2 and 4. Depletion of γ-tubulin 1 by RNAi in U2OS cells causes impaired microtubule nucleation and metaphase arrest. Wild-type phenotype in γ-tubulin 1-depleted cells is restored by expression of exogenous mouse or human γ-tubulin 2. Further, we show at both mRNA and protein levels using RT-qPCR and 2D-PAGE, respectively, that in contrast to Tubg1, the Tubg2 expression is dramatically reduced in mouse blastocysts. This indicates that γ-tubulin 2 cannot rescue γ-tubulin 1 deficiency in knock-out blastocysts, owing to its very low amount. The combined data suggest that γ-tubulin 2 is able to nucleate microtubules and substitute for γ-tubulin 1. We propose that mammalian γ-tubulins are functionally redundant with respect to the nucleation activity.
    PLoS ONE 01/2012; 7(6). DOI:10.1371/annotation/5dd084b1-20e6-4e1f-88e0-dfe05289da08 · 3.53 Impact Factor
  • Source
    ABSTRACT: The scavenger receptor cysteine rich (SRCR) domain is an ancient and conserved protein domain. CD163 and WC1 molecules are classed together as group B SRCR superfamily members, along with Spalpha, CD5 and CD6, all of which are expressed by immune system cells. There are three known types of CD163 molecules in mammals, CD163A (M130, coded for by CD163), CD163b (M160, coded for by CD163L1) and CD163c-alpha (CD163L1 or SCART), while their nearest relative, WC1, is encoded by a multigene family so far identified in the artiodactyl species of cattle, sheep, and pigs. We annotated the bovine genome and identified genes coding for bovine CD163A and CD163c-alpha but found no evidence for CD163b. Bovine CD163A is widely expressed in immune cells, whereas CD163c-alpha transcripts are enriched in the WC1+ gammadelta T cell population. Phylogenetic analyses of the CD163 family genes and WC1 showed that CD163c-alpha is most closely related to WC1 and that chicken and platypus have WC1 orthologous genes, previously classified as among their CD163 genes. Since it has been shown that WC1 plays an important role in the regulation of gammadelta T cell responses in cattle, which, like chickens, have a high percentage of gammadelta T cells in their peripheral blood, CD163c-alpha may play a similar role, especially in species lacking WC1 genes. Our results suggest that gene duplications resulted in the expansion of CD163c-alpha-like and WC1-like molecules. This expanded repertoire was retained by species known as "gammadelta T cell high", but homologous SRCR molecules were maintained by all mammals.
    BMC Evolutionary Biology 06/2010; 10:181. DOI:10.1186/1471-2148-10-181 · 3.41 Impact Factor
  • Source
    ABSTRACT: Duplicated genes can indefinitely persist in genomes if either both copies retain the original function due to dosage benefit (gene conservation), or one of the copies assumes a novel function (neofunctionalization), or both copies become required to perform the function previously accomplished by a single copy (subfunctionalization), or through a combination of these mechanisms. Different models of duplication retention imply different predictions about substitution rates in the coding portion of paralogs and about asymmetry of these rates. We analyse sequence evolution asymmetry in paralogs present in 12 Drosophila genomes using the nearest non-duplicated orthologous outgroup as a reference. Those paralogs present in D. melanogaster are analysed in conjunction with the asymmetry of expression rate and ubiquity and of segregating non-synonymous polymorphisms in the same paralogs. Paralogs accumulate substitutions, on average, faster than their nearest singleton orthologs. The distribution of paralogs' substitution rate asymmetry is overdispersed relative to that of orthologous clades, containing disproportionally more unusually symmetric and unusually asymmetric clades. We show that paralogs are more asymmetric in: a) clades orthologous to highly constrained singleton genes; b) genes with high expression level; c) genes with ubiquitous expression and d) non-tandem duplications. We further demonstrate that, in each asymmetrically evolving pair of paralogs, the faster evolving member of the pair tends to have lower average expression rate, lower expression uniformity and higher frequency of non-synonymous SNPs than its slower evolving counterpart. Our findings are consistent with the hypothesis that many duplications in Drosophila are retained despite stabilising selection being more relaxed in one of the paralogs than in the other, suggesting a widespread unfinished pseudogenization. This phenomenon is likely to make detection of neo- and subfunctionalization signatures difficult, as these models of duplication retention also predict asymmetries in substitution rates and expression profiles. Reviewers: This article has been reviewed by Dr. Jia Zeng (nominated by Dr. I. King Jordan), Dr. Fyodor Kondrashov and Dr. Yuri Wolf.
    Biology Direct 01/2014; 9(1):2. DOI:10.1186/1745-6150-9-2 · 4.04 Impact Factor