Frequent emergence and functional resurrection of processed pseudogenes in the human and mouse genomes

Hokkaido University, Sapporo, Hokkaido, Japan
Gene (Impact Factor: 2.14). 04/2007; 389(2):196-203. DOI: 10.1016/j.gene.2006.11.007
Source: PubMed


Despite the wide distribution of processed pseudogenes in mammalian genomes, such as those of human and mouse, relatively little is known about their roles in genomic evolution. While gene duplication is recognized as one of the major driving forces in genome evolution, processed pseudogenes, which are retrotransposed copies of mRNAs, have long been regarded as junk or selfish DNA. To elucidate the quantitative and qualitative contribution of processed pseudogenes to mammalian genome evolution, we sought to detect processed pseudogenes by extensively mapping mRNAs to both the human and mouse genomes, and then estimated the rate of their emergence. We found that the rate of pseudogene emergence was about 1-2% per gene per million years, which is as high as the rate of gene duplication in the human genome (0.9%), although the rate of pseudogene emergence decreased drastically in the hominid lineage. Furthermore, 1% of the processed pseudogenes appeared to have been reinvigorated by post-retrotransposition transcription, many of them preserving intact coding regions. Since the expression patterns of transcribed pseudogenes in various tissues differed markedly between human and mouse, their emergence might have led to species-specific evolution. Our results indicate that the generation of processed pseudogenes was not wholly futile but has instead been an indispensable resource, driving dynamic evolution of mammalian genomes.

    • "However, due to differences in retrocopy screening strategies (Baertsch et al. 2008), there is no consensus on the number of retrocopies, even for the human genome. Methods based on mRNA sequence alignments and accurate annotations have identified 7,000–13,000 retrocopies (Sakai et al. 2007; Baertsch et al. 2008; Pei et al. 2012). However, methods based on protein sequence alignments have reported 3,000–6,000 retrocopies (Marques et al. 2005b; Vinckenbosch et al. 2006). "
    ABSTRACT: Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (~7,500 retrocopies) in Catarrhini (Old World Monkeys [OWM], including humans), but a surprisingly large number of retrocopies (~10,000) in Platyrrhini (New World Monkeys [NWM]), which may be a by-product of higher L1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified ~3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and show no expression correlation with their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
    Full-text · Article · Jul 2015 · Genome Biology and Evolution
    • "Hence, they became evolutionary vestiges providing considerable information on genome history and evolution. A thorough analysis of the machinery of pseudogenization is relevant for estimating the frequency of duplicate genes in genomes (Sakai et al. 2007). "
    ABSTRACT: Pseudogenes are defined as non-functional relatives of genes whose protein-coding abilities are lost and are no longer expressed within cells. They are an outcome of accumulation of mutations within a gene whose end product is not essential for survival. Proper investigation of the procedure of pseudogenization is relevant for estimating occurrence of duplications in genomes. Frankineae houses an interesting group of microorganisms, carving a niche in the microbial world. This study was undertaken with the objective of determining the abundance of pseudogenes, understanding strength of purifying selection, investigating evidence of pseudogene expression, and analysing their molecular nature, their origin, evolution and deterioration patterns amongst domain families. Investigation revealed the occurrence of 956 core pFAM families sharing common characteristics indicating co-evolution. WD40, Rve_3, DDE_Tnp_IS240 and phage integrase core domains are larger families, having more pseudogenes, signifying a probability of harmful foreign genes being disabled within transposable elements. High selective pressure depicted that gene families rapidly duplicating and evolving undoubtedly facilitated creation of a number of pseudogenes in Frankineae. Codon usage analysis between protein-coding genes and pseudogenes indicated a wide degree of variation with respect to different factors. Moreover, the majority of pseudogenes were under the effect of purifying selection. Frankineae pseudogenes were under stronger selective constraints, indicating that they were functional for a very long time and became pseudogenes abruptly. The origin and deterioration of pseudogenes has been attributed to selection and mutational pressure acting upon sequences for adapting to stressed soil environments.
    Full-text · Article · Nov 2013 · Journal of Biosciences
    • "These data suggested a nearly simultaneous burst of PP and Alu formation in the genomes of ancestral primates. Similar results have been reported by other groups [53–55]. The peak period of amplification of these 2 distinct retroposons was estimated to be 40–50 million years ago (mya) [50]; moreover, concordant amplification of certain L1 subfamilies with PPs and Alus was observed. "
    ABSTRACT: A substantial number of "retrogenes" that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3'-end sequences of various SINEs originated from a corresponding LINE. As the 3'-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require "stringent" recognition of the 3'-end sequence of the RNA template. However, the 3'-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3'-poly(A) repeats. Since the 3'-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution.
    Full-text · Article · Aug 2013