Article

Mitochondrial pseudogenes: Evolution's misplaced witnesses

Abstract

Nuclear copies of mitochondrial DNA (mtDNA) have contaminated PCR-based mitochondrial studies of over 64 different animal species. Since the last review of these nuclear mitochondrial pseudogenes (Numts) in animals, Numts have been found in 53 of the species studied. Recent evidence suggests that Numts are not equally abundant in all species: for example, they are more common in plants than in animals, and more numerous in humans than in Drosophila. Methods for avoiding Numts have now been tested, and several recent studies demonstrate the potential utility of Numt DNA sequences in evolutionary studies. As relics of ancient mtDNA, these pseudogenes can be used to infer ancestral states or root mitochondrial phylogenies. Where they are numerous and selectively unconstrained, Numts are ideal for the study of spontaneous mutation in nuclear genomes.

... The nuclear genomes of most animal species contain segments of the mitogenome [1] captured during the repair of double-strand breaks associated with meiotic recombination [2,3]. Many of these NUMTs (nuclear DNA sequences of mitochondrial origin) are short, but some include much of the mitochondrial genome [4]. ...
... NUMTs are typically recognized via screens for indels or premature stop codons (IPSCs) [1]. To determine if the NUMTs identified in our analysis were diagnosable, we first searched for indels. ...
... The presence of NUMTs in insect genomes has been known for 40 years [63], but details on their abundance and attributes have only slowly gained clarity. Early studies revealed that NUMTs range widely in size [4], that NUMT counts vary among taxa [8], and that sequence change slows after nuclear integration [1,64]. Because of the latter property, NUMTs can illuminate deep time events [65,66]. ...
Article
Full-text available
The nuclear genomes of most animal species include NUMTs, segments of the mitogenome incorporated into their chromosomes. Although NUMT counts are known to vary greatly among species, there has been no comprehensive study of their frequency/attributes in the most diverse group of terrestrial organisms, insects. This study examines NUMTs derived from a 658 bp 5' segment of the cytochrome c oxidase I (COI) gene, the barcode region for the animal kingdom. This assessment is important because unrecognized NUMTs can elevate estimates of species richness obtained through DNA barcoding and derived approaches (eDNA, metabarcoding). This investigation detected nearly 10,000 COI NUMTs ≥ 100 bp in the genomes of 1,002 insect species (range = 0-443). Variation in nuclear genome size explained 56% of the mitogenome-wide variation in NUMT counts. Although insect orders with the largest genome sizes possessed the highest NUMT counts, there was considerable variation among their component lineages. Two thirds of COI NUMTs possessed an IPSC (indel and/or premature stop codon) allowing their recognition and exclusion from downstream analyses. The remainder can elevate species richness as they showed 10.1% mean divergence from their mitochondrial homologue. The extent of exposure to "ghost species" is strongly impacted by the target amplicon's length. NUMTs can raise apparent species richness by up to 22% when a 658 bp COI amplicon is examined versus a doubling of apparent richness when 150 bp amplicons are targeted. Given these impacts, metabarcoding and eDNA studies should target the longest possible amplicons while also avoiding use of 12S/16S rDNA as they triple NUMT exposure because IPSC screens cannot be employed.
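The IPSC screen mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `has_ipsc`, the `frame` offset, and the use of a length difference that is not a multiple of three as a stand-in for a frame-shifting indel are all assumptions made for illustration. The stop codons are those of the invertebrate mitochondrial code (NCBI translation table 5), in which TGA encodes tryptophan rather than a stop.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI translation table 5).
INVERTEBRATE_MT_STOPS = frozenset({"TAA", "TAG"})


def has_ipsc(seq, ref_len, frame=0, stops=INVERTEBRATE_MT_STOPS):
    """Return True if seq shows a likely frameshift or an in-frame stop codon.

    seq     -- candidate sequence (alignment gaps '-' are ignored)
    ref_len -- length of the mitochondrial reference segment (hypothetical input)
    frame   -- offset of the first complete codon (assumed known)
    """
    seq = seq.upper().replace("-", "")
    # Simplified indel proxy: a length difference that is not a multiple
    # of three would shift the reading frame.
    if (len(seq) - ref_len) % 3 != 0:
        return True
    # Walk the complete codons in the chosen frame and look for a stop.
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in stops for codon in codons)
```

A sequence that passes this check is not necessarily mitochondrial; as the abstract notes, roughly a third of COI NUMTs carry no IPSC at all.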
... Mitochondrial DNA is often used for phylogenetic studies that investigate matrilineal inheritance patterns (Chaitanya et al. 2014), inter- and intraspecific divergences (Cronin et al. 1991; Gill et al. 1993; Bowers et al. 1994), and for studies that use samples with low DNA copy numbers (Hofreiter et al. 2001b; Merheb et al. 2019). However, the presence of nuclear mitochondrial (numt) pseudogenes (designated as Numt by Lopez et al. 1994) may hinder the identification of true cytoplasmic mitochondrial (cymt) DNA sequences and the reliability of mtDNA for phylogenetic and population genetic comparisons (Bensasson et al. 2001; Smart et al. 2019). Numts arise when mitochondrial DNA is incorporated into the nuclear genome during chromosomal double-strand break repair by nonhomologous recombination (... Schmidt 1995, 1996; Bensasson et al. 2001). Organellar DNA fragments are found in the nuclear genomes of many eukaryotes (Bensasson et al. 2001; Gaziev and Shaikhaev 2010), mostly in noncoding intergenic regions and introns (Bensasson et al. 2001; Gaziev and Shaikhaev 2010; Smart et al. 2019). Numts may vary in size and sequence depending on the mitochondrial DNA that is integrated into the nuclear genome during double-stranded break repair. ...
Article
Nuclear mitochondrial pseudogenes (numts) may hinder the reconstruction of mtDNA genomes and affect the reliability of mtDNA datasets for phylogenetic and population genetic comparisons. Here, we present the program Numt Parser, which allows for the identification of DNA sequences that likely originate from numt pseudogene DNA. Sequencing reads are classified as originating from either numt or true cytoplasmic mitochondrial (cymt) DNA by direct comparison against cymt and numt reference sequences. Classified reads can then be parsed into cymt or numt datasets. We tested this program using whole genome shotgun-sequenced data from two ancient Cape lions (Panthera leo), because mtDNA is often the marker of choice for ancient DNA studies and the genus Panthera is known to have numt pseudogenes. Numt Parser decreased sequence disagreements that were likely due to numt pseudogene contamination and equalized read coverage across the mitogenome by removing reads that likely originated from numts. We compared the efficacy of Numt Parser to two other bioinformatic approaches that can be used to account for numt contamination. We found that Numt Parser outperformed approaches that rely only on read alignment or Basic Local Alignment Search Tool (BLAST) properties, and was effective at identifying sequences that likely originated from numts while having minimal impacts on the recovery of cymt reads. Numt Parser therefore improves the reconstruction of true mitogenomes, allowing for more accurate and robust biological inferences.
... The new caracal sequences discovered in this study were deposited in GenBank with accession numbers OM417235-OM417242 (Table 1). Several observations indicate that the Cyt b sequences produced are mitochondrial in origin and not copies of mtDNA integrated into the nuclear genome (Numts) (Lopez et al. 1994; Arctander 1995; Zhang and Hewitt 1996; Bensasson et al. 2001). First, PCRs consistently yielded single products of the expected size. ...
... Second, independent replicate PCRs produced identical sequences. Third, the sequences were unambiguous and did not contain indels or stop codons (Smith et al. 1992; Arctander 1995; Collura and Stewart 1995; DeWoody et al. 1999; Mirol et al. 2000; Mundy et al. 2000; Bensasson et al. 2001; Lü et al. 2002; Rodríguez et al. 2007; DeWoody 2007, 2008; Dubey et al. 2009; Baldo et al. 2011; Pérez et al. 2017; Torres et al. 2022). Finally, as typical of Cyt b (Kocher et al. 1989; Irwin et al. 1991), they showed a bias toward thymine at second codon positions (39.9% of thymines compared with 27.4% and 18.6% at first and third codon positions, respectively), and a strong bias against guanine at third codon positions (4.4% of guanines compared with 23.7% and 14% at first and second codon positions, respectively) and against second codon position substitutions (relative rates of substitution at first, second, and third positions within codons in the ratio 8 to 1 to 30, which is in good agreement with estimates for Cyt b in mammals (Irwin et al. 1991; Mundy et al. 2000)). ...
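The codon-position composition check described in this passage can be sketched as follows. This is an illustrative sketch only: the function name `base_freq_by_codon_position` is hypothetical, and the sequence is assumed to start at the first position of a codon and to contain no gaps.

```python
def base_freq_by_codon_position(seq, base):
    """Return the frequency of `base` at codon positions 1, 2 and 3.

    Assumes seq is in frame (index 0 is a first codon position) and ungapped.
    Trailing bases that do not complete a codon are ignored.
    """
    seq = seq.upper()
    base = base.upper()
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    freqs = []
    for pos in range(3):
        column = seq[pos:usable:3]  # every base at this codon position
        freqs.append(column.count(base) / len(column) if column else 0.0)
    return freqs
```

For a functional Cyt b sequence, the passage above would lead one to expect thymine to be most frequent at the second position and guanine to be rare at the third; a NUMT released from selective constraint may drift away from this pattern over time.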
... Another scenario, for which there is growing evidence for its potential importance (Ottenburghs 2020), is hybridization with extinct (or unsampled) lineages, which in the case of mtDNA is reflected in highly divergent haplotypes, derived from capture by introgression of extinct (or unsampled) variants ('ghost introgression'; e.g., Rakotoarivelo et al. 2019; Zhang et al. 2019). There is always the possibility, however, that highly divergent haplotypes are in fact Numts, which are sometimes also derived from past hybridization and introgression with now-extinct mitochondrial lineages (Sunnucks and Hales 1996; Bensasson et al. 2001; Baldo et al. 2011; Pérez et al. 2017). In the case of the Somalian haplotype identified in this study, its nucleotide composition and substitution rate biases among codon positions are congruent with those of the other caracal haplotypes. ...
Article
Full-text available
The caracal (Caracal caracal) is a medium-sized felid with a wide distribution in Africa and extending through Southwest Asia to India. It remains essentially unstudied in its genetic diversity and structure and phylogeographic history throughout its geographic range. In this study, we analysed mitochondrial Cytochrome b variation in the Iranian caracal and found considerably low diversity and a lack of geographic structure. Mitochondrial diversity patterns are compatible with a recent demographic bottleneck or a founder event by a single or few lineages at the origin of the extant diversity. We also analysed sequences from other areas along the caracal’s range for the first range-wide phylogeographic comparisons in the species. The haplotypes found in the Arabian Peninsula are identical or closely related to the haplotype predominant in Iran, which raises the hypothesis that the historical demographic pattern inferred for Iran may be extended to Southwest Asia. A remarkable result was to find in a Somalian individual a haplotype very divergent from all other sequences analysed. Several different analyses, including sequencing of a fragment of the ATP8/6 genes located at least about 6.4 kb from Cytochrome b in the mitochondrial DNA molecule, suggest that this lineage in the Somalian sample is mitochondrial and not a non-functional nuclear pseudogene. Its genetic divergence from haplotypes in Southwest Asia and South Africa was estimated to be 8.2 ± 1.4% and 7.5 ± 1.3%, respectively. In comparison, the genetic divergence between the more geographically distant Southwest Asian and South African haplotypes was 2.0 ± 0.6%. 
A critical examination of the results of the different molecular clock methods used points to a divergence time between the Somalian haplotype and the Southwest Asian + South African clade in the late Early Pleistocene at ~ 1–1.3 Ma, and between the Southwest Asian and South African lineages at ~ 300–500 ka, i.e., midway through the Middle Pleistocene.
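The divergence percentages quoted in this abstract are pairwise sequence distances. The abstract does not say which distance measure was used, so the sketch below shows only the simplest one, the uncorrected p-distance (proportion of differing sites); the function name `p_distance` and the decision to skip gaps and ambiguous bases are assumptions for illustration.

```python
def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

Multiplying the result by 100 gives a percentage comparable in form to the figures above, although model-corrected distances will generally be somewhat larger than the uncorrected value at high divergence.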
... To check whether NUMTs had been co-amplified, a phylogenetic analysis of all obtained haplotypes was conducted. When phylogenetic relationships within a taxon are well known, pseudogenes can often be detected by the atypical length of their branches and irregular topology [3,4,9,38], since the mutation rate in NUMTs is about ten times lower compared to mtDNA [5,26]. D-loop sequences of E. tancrei, steppe lemming, Lagurus lagurus, yellow steppe lemmings, Eolagurus luteus, and European water vole, Arvicola amphibius, were included in the analysis; the large-eared vole, Alticola macrotis and the bank vole, Myodes glareolus were used to root the tree. ...
... Once in the cytoplasm, mitochondrial DNA can, under certain circumstances, integrate into the nuclear genome. Although it is unlikely that NUMTs are functional per se due to differences between the nuclear and mitochondrial genetic codes [5], it can be assumed that in rare cases they may perform a functional role as regulatory elements or through the creation of new exons. ...
Article
Full-text available
Background: Ellobius talpinus is a subterranean rodent representing an attractive model in population ecology studies due to its highly specialized lifestyle and sociality. In such studies, mitochondrial DNA (mtDNA) is widely used. However, if nuclear copies of mtDNA (NUMTs) are present, they may co-amplify with the target mtDNA fragment, generating misleading results. The aim of this study was to determine whether NUMTs are present in E. talpinus. Methods and results: PCR amplification of the putative mtDNA CytB-D-loop fragment using ‘universal’ primers from 56 E. talpinus samples produced multiple double peaks in 90% of the sequencing chromatograms. To reveal NUMTs, molecular cloning and sequencing of PCR products of three specimens was conducted, followed by phylogenetic analysis. The pseudogene nature of three out of the seven detected haplotypes was confirmed by their basal positions in relation to other Ellobius haplotypes in the phylogenetic tree. Additionally, ‘haplotype B’ was basal in relation to other E. talpinus haplotypes and was found in very distant sampling sites. BLASTN search revealed 195 NUMTs in the E. talpinus nuclear genome, including fragments of all four PCR amplified pseudogenes. Although the majority of the NUMTs studied were short, the entire mtDNA had copies in the nuclear genome. The most numerous NUMTs were found for rrnL, COXI, and D-loop. Conclusions: Numerous NUMTs are present in E. talpinus and can be difficult to discriminate from mtDNA sequences. Thus, in future population or phylogenetic studies in E. talpinus, the possibility of cryptic NUMT amplification should always be taken into account.
... NUMTs have been investigated across a broad spectrum of vertebrates, spanning fish, amphibians, reptiles, birds and mammals (... basis underlying these variations and NUMT evolutionary trajectories in a mammal-wide context are poorly understood. It has been suggested that NUMTs, the molecular fossils of ancestral mtDNA, can be potential genetic markers to infer phylogenetic relationships (Bensasson, et al. 2001). However, the application is only limited to a few studies that focused on groups of species in narrow phylogenetic brackets, such as Primates (Hazkani-Covo 2009; Hazkani-Covo and Graur 2007), Passeriformes (Liang, et al. 2018) and Chiroptera (Puechmaille, et al. 2011). ...
... ancestral to the root of mammals (Fig. 5a). These results are not surprising because, as relics of ancient mtDNA, NUMTs evolve under limited selective constraints (Bensasson et al. 2001). ...
Article
Full-text available
The escape of DNA from mitochondria into the nuclear genome (nuclear mitochondrial DNA, NUMT) is an ongoing process. Although pervasively observed in eukaryotic genomes, their evolutionary trajectories in a mammal-wide context are poorly understood. The main challenge lies in the orthology assignment of NUMTs across species due to their fast evolution and chromosomal rearrangements over the past 200 million years. To address this issue, we systematically investigated the characteristics of NUMT insertions in 45 mammalian genomes, and established a novel, synteny-based method to accurately predict orthologous NUMTs and ascertain their evolution across mammals. With a series of comparative analyses across taxa, we revealed that NUMTs may originate from non-random regions in mtDNA, are likely found in transposon-rich and intergenic regions, and unlikely code for functional proteins. Using our synteny-based approach, we leveraged 630 pairwise comparisons of genome-wide microsynteny and predicted the NUMT orthology relationships across 36 mammals. With the phylogenetic patterns of NUMT presence-and-absence across taxa, we constructed the ancestral state of NUMTs given the mammal tree using a coalescent method. We found support on the ancestral node of Fereuungulata within Laurasiatheria, whose subordinal relationships are still controversial. This study broadens our knowledge on NUMT insertion and evolution in mammalian genomes and highlights the merit of NUMTs as alternative genetic markers in phylogenetic inference.
... Nevertheless, other contaminants, like nuclear mitochondrial pseudogenes (NUMTs) or heteroplasmy, are harder to spot and remove. NUMTs are nonfunctional fragments of mitochondrial DNA incorporated in the nucleus; they are widespread and have been reported in all the major clades of eukaryotes (Bensasson et al., 2001; Jordal & Kambestad, 2013; Moulton et al., 2010; Song et al., 2014). NUMTs have been recognized as a critical issue in phylogenetics (Bensasson et al., 2001), but they are particularly harmful to DNA barcoding and metabarcoding studies (Graham et al., 2021; Song et al., 2008; Sorenson & Quinn, 1998). ...
Article
Full-text available
Aim: Desert springs or oases are the only permanent mesic environments in highly water-limited arid regions. Oases have immense cultural, evolutionary and ecological importance for people and a high number of endemic and relic species. Nevertheless, they are also highly vulnerable ecosystems, with invasive species, overexploitation and climate change being the primary threats. We used the arthropod communities' spatiotemporal diversity and distribution patterns as a proxy to understand biodiversity dynamics in two geographically close but ecologically contrasting and highly threatened ecosystems: deserts and oases. Location: Baja California Peninsula, Mexico. Methods: Arthropod communities at five oases and surrounding desert scrub areas were sampled in two seasons. Using DNA metabarcoding and traditional taxonomic surveys, we tried to identify what biotic and abiotic characteristics of the habitat are important drivers of arthropod diversity and how these characteristics can change across spatial and temporal scales. Results: Over 6200 individuals representing 23 orders were collected. In oasis samples, the community composition fluctuated more in space (i.e. among sites) than in time (i.e. seasons). Thus, seasonal changes did not affect oasis community diversity and composition, but the dissimilarity among sites increased with geographic distance. Moreover, anthropic activities negatively correlated with arthropod diversity in oases. On the other hand, the season, geography (e.g. latitude) and biotic characteristics of the habitat (e.g. sampled scrub species) significantly affected the diversity and composition of the desert arthropod communities. Main Conclusions: Neutral dynamics (e.g. historical climatic events, dispersal limitation and spatial component) and human impact significantly influenced the biodiversity patterns of each oasis. In contrast, the habitat's seasonal variation and biotic.
... However, the few studies trying to calibrate a molecular clock for terrestrial isopods do not support this (Ketmaier et al., 2003; Poulakakis & Sfenthourakis, 2008; Wysocka et al., 2008). Lastly, the presence of nuclear copies of the mtDNA genes (NUMTs) can yield deceptive results including extremely divergent lineages (Bensasson et al., 2001). However, in the case of coding genes, these divergent NUMTs should be plagued with stop codons and would be easy to detect. ...
Article
Full-text available
The terrestrial isopod genus Ligidium includes 58 species from Europe, Asia, and North America. In Eastern North America four species are recognized: L. floridanum and L. mucronatum, known just from their type localities in Florida and Louisiana respectively, L. blueridgensis, endemic to the southern Appalachians, and L. elrodii, widespread from Georgia to Ontario. The genus shows a marked morphological conservatism, and species are differentiated mostly by small morphological differences; it is not always easy to determine if such variability represents inter- or intraspecific variation. Here, we explore the diversity of Ligidium from the southern Appalachian Mountains, exploring the congruence of morphologically defined groups with multilocus phylogenetic reconstructions and molecular species delimitation methods. We have studied a total of 130 specimens from 37 localities, mostly from the southern Appalachians, and analysed mtDNA (Cox1) and nuclear (28S, NaK) sequences. Morphologically, we recognized eight morphotypes, most of them assignable to current concepts of L. elrodii and L. blueridgensis. Phylogenetic analyses supported the evolutionary independence of all morphotypes, and suggest the existence of 8–9 species, including limited cryptic diversity. Single-locus delimitation analyses based on mtDNA data suggest the existence of a much higher number of species than the multilocus analyses. The estimated age of the ancestors of sampled lineages indicates a long presence of the genus in eastern North America and old speciation events through the Miocene. Our results indicate a higher diversity than previously thought among the Ligidium populations present in the southern Appalachian Mountains, with several species to be described.
... Nevertheless, there seems to be a weakness of the applied COI marker in the quality of Hymenoptera species identifications (Marquina et al., 2019). Further, while the selected primers are generally well suited to detect a high share of all the different species groups present in a bulk sample (Brandon-Mong et al., 2015), they potentially have a problem with unintended amplification, for example, of nuclear mitochondrial pseudogenes (numts) (Bensasson et al., 2001), which appear exceptionally often in honey bees (Pamilo et al., 2007). Additionally, among others, the frequent presence of numts in association with COI markers was also identified as an issue for species identification of grasshoppers (Hawlitschek et al., 2017;Song et al., 2008), of which 15 out of 39 identified species were not likely to occur in the study area, thus showing a considerably higher share of misidentifications than the considered pollinator groups (Table S3). ...
Article
Full-text available
While transect walks have long been the preferred monitoring method for many flying insect taxa, malaise traps combined with DNA metabarcoding have gained growing prominence. However, it remains unclear whether both methods reveal comparable species richness and the same ecological drivers along environmental gradients. We selected three groups of pollinators (wild bees, hoverflies and butterflies) and one group of herbivores (grasshoppers) as functionally important and conservation‐relevant model groups, comparing results of both methods along an elevational gradient in the German Alps. Across the study region, both methods detected a similarly high species richness of pollinators with ~50% overlap of species pools, but transect walks revealed more species per site, especially in higher elevations and under low temperatures. Body size spectra differed between methods, with on average more large butterfly and more small bee species in transect walks. Nevertheless, temperature and flower richness were consistent drivers of pollinator richness, independent of the sampling method. Grasshopper richness from transect walks was considerably higher than from malaise traps. Both methods identified temperature and only malaise traps also identified management as drivers of grasshopper richness. We conclude that malaise traps are principally suitable substitutes for the more time‐consuming pollinator transect walks. However, the effectiveness of these passive traps is more susceptible to changes in sampling temperature, and in some pollinator groups, body size classes are presented differently, which is important to consider during analyses. For grasshoppers, transect walks appear to be more suitable to assess species richness, as considerably more species can be monitored.
... The number of reports on nuclear mitochondrial pseudogenes (NUMTs) and mitochondrial heteroplasmy in eukaryotes is drastically increasing (Lopez et al., 1994; Sorenson & Quinn, 1998; Bensasson et al., 2001; Kmiec et al., 2006; Pamilo et al., 2007; Song et al., 2008; Hazkani-Covo et al., 2010; Rodríguez-Pena et al., 2020; Tan et al., 2020; Wei et al., 2022). Co-amplification of these noise sequences with genuine mtDNA can not only complicate the interpretation of mtDNA data, but also have a significant negative impact on the quality of the chromatogram obtained by direct nucleotide sequencing (Song et al., 2008; Buhay, 2009). ...
Article
Full-text available
The Japanese spiny lobster Panulirus japonicus has been reported to harbor a number of nuclear mitochondrial pseudogenes (NUMTs) and heteroplasmy. However, distinguishing phylogenetically young NUMTs, heteroplasmy, and PCR-cloning artefacts may be challenging. In addition, greater degradation of mtDNA than of nuclear DNA in old tissue specimens may promote amplification of NUMTs. In this study, we performed clone library-based nucleotide sequence analysis of the partial mtDNA COI gene using genomic DNA and cDNA obtained from fresh tissues of the Japanese spiny lobster and genomic DNA obtained from three crustacean and three fish species. Minor nucleotide substitutions between clones in an individual were ubiquitously observed in all species examined including the lobster cDNA, suggesting that most of these were artefacts. Rarely, a few clones were most likely to have originated from heteroplasmic copies, as they had skewed nucleotide substitutions at the third codon position. NUMTs are more likely to be detected in the Japanese spiny lobster than in the other species examined, although their detection may be somewhat suppressed by using genomic DNA obtained from fresh tissue.
... Because no reading frame was broken in the extra haplotype, it is very unlikely that this haplotype resulted from nuclear mitochondrial DNA segments inserted into the nuclear genome (e.g. [31,32]). However, to corroborate if this extra haplotype is mitochondrial, it is necessary to determine the whole mitochondrial genome sequence for each of the two mitochondria within a single individual. ...
Article
Full-text available
Heteroplasmy, the presence of multiple mitochondrial DNA (mtDNA) haplotypes within cells of an individual, is caused by mutation or paternal leakage. However, heteroplasmy is usually resolved to homoplasmy within a few generations because of germ-line bottlenecks; therefore, instances of heteroplasmy are limited in nature. Here, we report heteroplasmy in the ricefish species Oryzias matanensis, endemic to Lake Matano, an ancient lake in Sulawesi Island, in which one individual was known to have many heterozygous sites in the mitochondrial NADH dehydrogenase subunit 2 (ND2) gene. In this study, we cloned the ND2 gene for some additional individuals with heterozygous sites and demonstrated that they are truly heteroplasmic. Phylogenetic analysis revealed that the extra haplotype within the heteroplasmic O. matanensis individuals clustered with haplotypes of O. marmoratus, a congeneric species inhabiting adjacent lakes. This indicated that the heteroplasmy originated from paternal leakage due to interspecific hybridization. The extra haplotype was unique and contained two non-synonymous substitutions. These findings demonstrate that this hybridization-driven heteroplasmy was maintained across generations for a long time to the extent that the extra mitochondria evolved within the new host.
... Enrichment enables a higher proportion of sequencing reads to be directly relevant to the targeted organelle genomes, thus optimizing the sequencing efforts and ensuring more comprehensive coverage without the need for prohibitively deep sequencing of the entire sample. This approach is especially crucial in large-scale or multi-sample studies, where the cost savings can be substantial [42]. ...
Article
Full-text available
In this comprehensive review, we explore the significant role that nanopore sequencing technology plays in the study of plant organellar genomes, particularly mitochondrial and chloroplast DNA. To date, the application of nanopore sequencing has led to the successful sequencing of over 100 plant mitochondrial genomes and around 80 chloroplast genomes. These figures not only demonstrate the technology's robustness but also mark a substantial advancement in the field, highlighting its efficacy in decoding the complex and dynamic nature of these genomes. Nanopore sequencing, known for its long-read capabilities, significantly surpasses traditional sequencing techniques, especially in addressing challenges like structural complexity and sequence repetitiveness in organellar DNA. This review delves into the nuances of nanopore sequencing, elaborating on its benefits compared to conventional methods and the groundbreaking applications it has fostered in plant organellar genomics. While its transformative impact is clear, the technology's limitations, including error rates and computational requirements, are discussed, alongside potential solutions and prospects for technological refinement.
... Second, using genomic DNA (gDNA) severely increases the chances of contaminating downstream analyses with nuclear mitochondrial pseudogenes (NUMTs) (Machida et al., 2009, 2021; Schultz & Hebert, 2022). These NUMTs have been reported in many animal phyla (Bensasson et al., 2001; Hazkani-Covo et al., 2010; Ožana et al., 2022; Song et al., 2008; Williams & Knowlton, 2001), and if amplified and sequenced, species diversity assessments might be spuriously inflated (Machida & Lin, 2017; Schultz & Hebert, 2022; Song et al., 2008). As an example, comparison of the operational taxonomic units (OTUs) estimated from zooplankton community gDNA with those estimated from complementary DNA (cDNA), which is reverse transcribed from mRNA, indicates roughly twofold richness inflation in gDNA analyses (Machida et al., 2021). ...
Article
Full-text available
PCR‐based high‐throughput sequencing has permitted comprehensive resolution analyses of zooplankton diversity dynamics. However, significant methodological issues still surround analyses of complex bulk community samples, not least in prevailing PCR‐based approaches. Marine drifting animals—zooplankton—play essential ecological roles in the pelagic ecosystem, transferring energy and elements to higher trophic levels, such as fishes, cetaceans and others. In the present study, we collected 48 size‐fractionated zooplankton samples in the vicinity of a coral reef island with environmental gradients. To investigate the spatiotemporal dynamics of zooplankton diversity patterns and the effect of PCR amplification biases across these complex communities, we first took a metatranscriptomics approach. Comprehensive computational analyses revealed a clear pattern of higher/lower homogeneity in smaller/larger zooplankton compositions across samples respectively. Our study thus suggests changes in the role of dispersal across the sizes. Next, we applied in silico PCR to the metatranscriptomics datasets, in order to estimate the extent of PCR amplification bias. Irrespective of stringency criteria, we observed clear separations of size fraction sample clusters in both metatranscriptomics and in silico datasets. In contrast, the pattern—smaller‐fractioned communities had higher compositional homogeneity than larger ones—was observed in the metatranscriptomics data but not in the in silico datasets. To investigate this discrepancy further, we analysed the mismatches of widely used mitochondrial CO1 primers and identified priming site mismatches likely driving PCR‐based biases. Our results suggest that the use of metatranscriptomics or, although less ideal, a redesign of the CO1 primers is necessary to circumvent these issues.
... The new sequences were edited to remove unreliably resolved bases at both ends and aligned using the CLUSTAL X 2.0 program (Larkin et al., 2007). After checking for the possible interference of mitochondrial pseudogenes following the recommendations of Zhang & Hewitt (1996) and Bensasson et al. (2001), the alignment was collapsed into haplotypes using DnaSP 5.10 (Librado & Rozas, 2009). Using the unique haplotypes as queries, standard nucleotide BLAST searches were performed in the NCBI database (https://blast.ncbi.nlm.nih.gov) to retrieve the homologous sequences. ...
Article
There is increasing evidence that demographic history and phylogeographic consequences of past climate changes unfolded locally and varied from region to region. Despite the high rodent species diversity and endemism in low-latitude Asia, how these species have responded to past climatic fluctuations remains unexplored. In the present study, we trapped 253 murine individuals and sequenced their mitochondrial COI gene. A total of ten species belonging to five genera were recognized through phylogenetic analyses. The results of divergence time estimation showed that the most recent common ancestors of all recognized species occurred in the Pleistocene. Signals of demographic expansion were detected in six species by at least one test, and the events of sharp population size increase occurred asynchronously among species. The demographic expansion during the glaciation periods was corroborated by the expanded suitable distributional areas predicted by ecological niche modelling. The diversified demographic histories of rat communities in low-latitude Asia suggest that species might have responded to past regional environmental changes in an individualistic way. ADDITIONAL KEYWORDS: demographic history, divergence dating, ecological niche modelling, low-latitude Asia, Murinae.
... We have shown that the NUMT content, even within closely related apicomplexan species, varies significantly. These data are largely consistent with previous observations where a large variation in NUMT content has been described among Drosophila melanogaster, Anopheles gambiae, and A. mellifera, and even among mammals like the human, mouse, and rat (16, 17, 41, 56–58). This variation may be explained by two major forces: differences in the frequency at which species acquire and retain DNA from their mitochondria and the differential rates of NUMT removal (57, 59). ...
Article
Toxoplasma gondii is a zoonotic protist pathogen that infects up to one third of the human population. This apicomplexan parasite contains three genome sequences: nuclear (65 Mb); plastid organellar, ptDNA (35 kb); and mitochondrial organellar, mtDNA (5.9 kb of non-repetitive sequence). We find that the nuclear genome contains a significant amount of NUMTs (nuclear integrants of mitochondrial DNA) and NUPTs (nuclear integrants of plastid DNA) that are continuously acquired and represent a significant source of intraspecific genetic variation. NUOT (nuclear DNA of organellar origin) accretion has generated 1.6% of the extant T. gondii ME49 nuclear genome—the highest fraction ever reported in any organism. NUOTs are primarily found in organisms that retain the non-homologous end-joining repair pathway. Significant movement of organellar DNA was experimentally captured via amplicon sequencing of a CRISPR-induced double-strand break in non-homologous end-joining repair competent, but not ku80 mutant, Toxoplasma parasites. Comparisons with Neospora caninum , a species that diverged from Toxoplasma ~28 mya, revealed that the movement and fixation of five NUMTs predates the split of the two genera. This unexpected level of NUMT conservation suggests evolutionary constraint for cellular function. Most NUMT insertions reside within (60%) or nearby genes (23% within 1.5 kb), and reporter assays indicate that some NUMTs have the ability to function as cis -regulatory elements modulating gene expression. Together, these findings portray a role for organellar sequence insertion in dynamically shaping the genomic architecture and likely contributing to adaptation and phenotypic changes in this important human pathogen.
... However, two other possibilities could explain this situation. The first is the formation of nuclear mitochondrial pseudogenes (NUMTs) [100,101], which are described as transpositions of mitochondrial DNA into the nuclear genome that can retain close homology to the original mitochondrial genes [102]. There are also references to the presence of COI-like sequences in many crustaceans, including crayfish [103,104]. ...
Article
European crayfish species are a clear example of the drastic decline that freshwater species are experiencing. In particular, the native species of the Iberian Peninsula, the white-clawed crayfish (WCC) Austropotamobius pallipes, is listed as “endangered” by the IUCN, is included in Annex II of the EU Habitat Directive, and requires special attention. Currently implemented conservation management strategies require a better understanding of its genetic diversity and phylogeographic patterns, as well as of its evolutionary history. For this purpose, we have generated the largest datasets of two informative ribosomal mitochondrial DNA regions, i.e., cytochrome oxidase subunit I and 16S, from selected populations of the WCC covering its geographical distribution. These datasets allowed us to analyze in detail the (i) genetic diversity and structure of WCC populations, and (ii) divergence times for Iberian populations by testing three evolutionary scenarios with different mtDNA substitution rates (low, intermediate, and high rates). The results indicate high levels of haplotype diversity and a complex geographical structure for WCC in the Iberian Peninsula. The diversity found includes new unique haplotypes from the Iberian Peninsula and reveals that most of the WCC genetic variability is concentrated in the northern and central-eastern regions. Although the molecular dating analyses provided divergence times that were not statistically supported, the proposed scenarios were congruent with previous studies, which related the origin of these populations to paleogeographic events during the Pleistocene, suggesting an Iberian origin for these WCC. All results generated in this study indicate that the alternative hypothesis of an introduced origin of the Iberian WCC is highly improbable.
The results of this study have therefore allowed us to better understand the genetic diversity, structure patterns, and evolutionary history of the WCC in the Iberian Peninsula, which is crucial for the management and conservation needs of this endangered species.
... morroensis (Fig. 16). Numts have been reported in more than 60 animal and plant species and are most commonly described as fragments of less than 600 bp (Zhang and Hewitt 1996; Herrnstadt et al. 1999; Bensasson et al. 2001; Kim et al. 2006). Ancient samples are prone to numts (Willerslev and Cooper 2005). ...
Thesis
Heermann’s kangaroo rats (Dipodomys heermanni; Rodentia: Heteromyidae) are endemic to California and primarily found in the dry, gravelly grassland and open chaparral habitats of the San Joaquin Valley. Current taxonomy (based on morphology and habitat use) recognizes nine subspecies within this kangaroo rat species. Management practices of D. heermanni primarily are based on this classification, but this taxonomy may not accurately reflect unique lineages in need of conservation. Using molecular and morphological data, I performed a phylogeographic assessment of D. heermanni examining relationships within and among the nine subspecies across the full geographic range of the species. Phylogenetic and network analyses of mitochondrial data from over 90 museum specimens (representing all nine subspecies distributed across the range of the species) revealed no substantial genetic differentiation within D. heermanni. Similarly, a geometric morphometric analysis of the cranium of over 200 adult D. heermanni museum specimens (again representing all subspecies across the geographic distribution of species) resulted in no apparent morphological clustering across geography. My analyses indicate that recognition of all nine subspecies is likely unwarranted and that conservation and management practices of D. heermanni are in need of revision.
... Additionally, there is a high risk that the resulting tree might not accurately represent the species tree as it is derived from a small sample of nuclear and mitochondrial genomes. Moreover, mitochondrial-based phylogeny might be further confounded by factors such as hybridization, mitochondrial introgression, heteroplasmy and presence of nuclear mitochondrial pseudogenes (Numts) (Bensasson et al., 2001;Bonnet et al., 2017;Dubey et al., 2009;Richly and Leister, 2004). ...
Article
Gerbillus is one of the most speciose genera among rodents, with ca. 51 recognized species. Previous attempts to reconstruct the evolutionary history of Gerbillus mainly relied on the mitochondrial cyt-b marker as a source of phylogenetic information. In this study, we utilize RAD-seq genomic data from 37 specimens representing 11 species to reconstruct the phylogenetic tree for Gerbillus, applying concatenation and coalescence methods. We identified four highly supported clades corresponding to the traditionally recognized subgenera: Dipodillus, Gerbillus, Hendecapleura and Monodia. Only two uncertain branches were detected in the resulting trees, with one leading to diversification of the main lineages in the genus, recognized by quartet sampling analysis as uncertain due to possible introgression. We also examined species boundaries for four pairs of sister taxa, including potentially new species from Morocco, using SNAPP. The results strongly supported a speciation model in which all taxa are treated as separate species. The dating analyses confirmed the Plio-Pleistocene diversification of the genus, with the uncertain branch coinciding with the beginning of aridification of the Sahara at the Plio-Pleistocene boundary. This study aligns well with the earlier analyses based on the cyt-b marker, reaffirming its suitability as an adequate marker for estimating genetic diversity in Gerbillus.
... There have been many reports of nuclear mtDNA pseudogenes (NUMTs) in the genomes of a diverse range of eukaryotic species (Bensasson et al. 2001, Song et al. 2008, Dubey et al. 2009). To confirm that amplified sequences were of true mitochondrial origin, two different PCRs, producing overlapping fragments, were performed on all individuals of the newly described species (Talpa hakkariensis sp. ...
Article
Subterranean life is associated with strong adaptive constraints, leading to the frequent occurrence of morphologically cryptic lineages. This is true of most small mammals, including moles (Eulipotyphla: Talpidae), where a number of species have been recognized recently, particularly following the application of molecular genetics. Here, we use mitochondrial and nuclear DNA sequence data and geometric morphometrics to explore the systematics and evolution of some of the least-known Western Palaearctic moles: the Talpa davidiana group of Eastern Anatolia/Iran. We show that T. davidiana includes four taxa, two of which we describe herein: T. hakkariensis sp. nov., T. davidiana davidiana, T. davidiana tatvanensis ssp. nov., and T. streetorum valid species. For the first time, we apply molecular species delimitation analyses to Talpa, confirming taxonomic hypotheses and suggesting the existence of further morphologically cryptic lineages. These analyses also support the recognition of T. transcaucasica as a valid species distinct from T. levantis. We present a revised phylogeny for Eurasian Talpa and increase the number of known extant taxa to 18, most of which are found in Anatolia, the global hotspot of diversity in this genus. This probably results from the isolation of suitable habitats by a combination of climatic and topographical heterogeneity.
... This introgressed mitochondrial DNA may artificially group the hybridizing species in a phylogeny. Mitochondrial DNA can also be transferred into the nuclear genome, forming NUMT pseudogenes [59]. This phenomenon has been identified in several beetle groups [15,16] and is expected to be present across the order. ...
Article
DNA barcoding has revolutionized how we discover, identify, and detect species. A substantial foundation has been established with millions of mitochondrial cytochrome c oxidase I sequences freely available for eukaryotes. However, issues with COI ranging from uniparental inheritance and small genetic population sizes to nuclear and asymmetric introgression can impede its use. We propose using CAD as the “nuclear barcode” to complement the COI barcode and ameliorate these concerns. We focused on beetles from taxonomically diverse species-level studies that used COI and CAD. An ambiguous barcode gap was present between intra- and interspecific genetic distances in CAD and COI; this led to difficulty with automated gap detection methods. We found pseudogenes, problematic population structure, introgression, and incomplete lineage sorting represented in the COI data. A CAD gene tree illuminated these cryptic problems. Placement tests of species and outgroups using distance-based tree building were largely successful for CAD, demonstrating its phylogenetic signal at the species and genus levels. Species placement issues were typically unique to one locus, allowing for recognition of misdiagnosis. We conclude that a CAD barcode is a valuable tool for beetle diagnostics, metabarcoding, and faunistic surveys.
... Moreover, relying solely on mtDNA data for genetic analysis has some caveats that must be considered. These include the potential presence of mtDNA pseudogenes in the nuclear genome (Bensasson et al., 2001), the fact that mtDNA represents only a single locus and reflects the matrilinear history, and the higher lineage sorting rate and allele extinction rate of mtDNA lineages in comparison with nuclear data (D. X. Zhang & Hewitt, 2003). ...
Article
Mitochondrial (mtDNA) genes have served as widely utilised genetic loci for phylogenetic and phylogeographic studies of animals. However, the phylogenetic performance of many mtDNA genes has not been empirically evaluated across lineages within hymenopteran wasps. To address this question, we assembled and analysed mitogenomic data from social wasps, representing the four recognised tribes of Polistinae and all Epiponini genera. Additionally, we evaluated whether mtDNA gene order in Polistinae is congruent with its tribal classification. Using concatenation phylogenetic methods, we show phylogenetic congruence between mitogenomic and nuclear data. Statistically comparing the phylogenetic performance of individual mtDNA genes, we demonstrate that for social wasps the molecular markers COI, 16S, NAD5, and NAD2 perform best, while ATP6, COII, and 12S show the worst results. Finally, we verified that the tRNA cluster close to the noncoding region is a hotspot of genetic rearrangements in Vespidae and can be used as additional information for the systematics of this group. Together, these results indicate that mitogenomes contain robust phylogenetic signal to elucidate the evolutionary history of Vespidae. Moreover, our study identifies the best choice of mtDNA markers for systematic investigations of social wasps.
... The nDNA is rich in sequences with a high degree of homology with the mtDNA, the so-called NUMTs (nuclear DNA mitochondrial sequences) (49). To confirm that the signal detected by the mtG4-ChIP-seq protocol is specific for mtDNA and not the result of NUMT pulldown, we depleted the mtDNA in the mitoBG4 cells by long-term ethidium bromide treatment (50). ...
Article
Mitochondrial DNA (mtDNA) replication stalling is considered an initial step in the formation of mtDNA deletions that are associated with inherited genetic disorders and aging. However, the molecular details of how stalled replication forks lead to the accumulation of mtDNA deletions are still unclear. Mitochondrial DNA deletion breakpoints preferentially occur at sequence motifs predicted to form G-quadruplexes (G4s), four-stranded nucleic acid structures that can fold in guanine-rich regions. Whether mtDNA G4s form in vivo and their potential implication for mtDNA instability are still under debate. Here, we developed new tools to map G4s in the mtDNA of living cells. We engineered a G4-binding protein targeted to the mitochondrial matrix of a human cell line and established the mtG4-ChIP method, enabling the determination of mtDNA G4s under different cellular conditions. Our results are indicative of transient mtDNA G4 formation in human cells. We demonstrate that mtDNA-specific replication stalling increases formation of G4s, particularly in the major arc. Moreover, elevated levels of G4s block the progression of the mtDNA replication fork and cause mtDNA loss. We conclude that stalling of the mtDNA replisome enhances mtDNA G4 occurrence, and that G4s not resolved in a timely manner can have a negative impact on mtDNA integrity.
... As a result, it is not surprising that the number of heteroplasmic variations in MZ twins increases with age. [Figures 4 and 5: Pearson correlation between the number of heteroplasmic variants and age in the 10 pairs of MZ twins at the 0.5% and 0.1% thresholds, respectively.] However, nuclear mitochondrial DNA segments (NUMTs), which are mtDNA fragments that have transferred to the nucleus, are a significant source of inaccuracy in the study of heteroplasmy [34,35]. The majority of NUMTs are found in the non-coding region and range in size from very small to very large, with chromosome 1 containing a representation of the whole mtDNA [12]. ...
Article
Differentiating between monozygotic (MZ) twins remains difficult because they have the same genetic makeup. Applying the traditional STR genotyping approach cannot differentiate one from the other. Heteroplasmy refers to the presence of two or more different mtDNA copies within a single cell and this phenomenon is common in humans. The levels of heteroplasmy cannot change dramatically during transmission in the female germ line but increase or decrease during germ-line transmission and in somatic tissues during life. As massively parallel sequencing (MPS) technology has advanced, it has shown the extraordinary quantity of mtDNA heteroplasmy in humans. In this study, a probe hybridization technique was used to obtain mtDNA and then MPS was performed with an average sequencing depth of above 4000. The results showed us that all ten pairs of MZ twins were clearly differentiated with the minor heteroplasmy threshold at 1.0%, 0.5%, and 0.1%, respectively. Finally, we used a probe that targeted mtDNA to boost sequencing depth without interfering with nuclear DNA and this technique can be used in forensic genetics to differentiate the MZ twins.
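The thresholding step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (a mapping from mtDNA position to per-base read counts), the function names, and the example counts are all hypothetical, and the rule used here (the second most common allele must reach the chosen fraction of total depth) is an assumed, simplified definition of a minor heteroplasmy call.

```python
def heteroplasmic_sites(allele_counts, threshold):
    """Return positions whose second most common allele reaches `threshold`.

    allele_counts maps position -> {base: read count}; this layout is a
    hypothetical stand-in for the output of a real variant-calling pipeline.
    """
    sites = set()
    for position, counts in allele_counts.items():
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        ranked = sorted(counts.values(), reverse=True)
        minor = ranked[1] if len(ranked) > 1 else 0
        if minor > 0 and minor / depth >= threshold:
            sites.add(position)
    return sites


def distinguishing_sites(twin_a, twin_b, threshold):
    """Positions called heteroplasmic in exactly one of the two twins."""
    return heteroplasmic_sites(twin_a, threshold) ^ heteroplasmic_sites(twin_b, threshold)


# Invented read counts at two positions, roughly 4000x depth.
twin_a = {73: {"A": 3940, "G": 60}, 16093: {"T": 3990, "C": 10}}
twin_b = {73: {"A": 4000}, 16093: {"T": 3992, "C": 8}}

for cutoff in (0.01, 0.005, 0.001):
    print(cutoff, sorted(distinguishing_sites(twin_a, twin_b, cutoff)))
```

Lower thresholds admit more low-frequency variants, which is also where NUMT-derived reads and sequencing errors are most likely to be mistaken for true heteroplasmy.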
... However, for ancient and conserved numts relative to homologous mtDNA, universal primers may tend to amplify numts rather than mtDNA (Bensasson et al., 2001). For taxonomical groups with known numts, numts could be avoided by manually designing primers. ...
Article
Noninvasive genetic sampling greatly facilitates studies on the genetics, ecology, and conservation of threatened species. Species identification is often a prerequisite for noninvasive sampling-based biological studies. Due to the low quantity and quality of genomic DNA from noninvasive samples, high-performance short-target PCR primers are necessary for DNA barcoding applications. The order Carnivora is characterized by an elusive habit and threatened status. In this study, we developed three pairs of short-target primers for identifying Carnivora species. The COI279 primer pair was suitable for samples with better DNA quality. The COI157a and COI157b primer pairs performed well for noninvasive samples and reduced the interference of nuclear mitochondrial pseudogenes (numts). COI157a could effectively identify samples from Felidae, Canidae, Viverridae, and Hyaenidae, while COI157b could be applied to samples from Ursidae, Ailuridae, Mustelidae, Procyonidae, and Herpestidae. These short-target primers will facilitate noninvasive biological studies and efforts to conserve Carnivora species.
... Once integrated into the nuclear genome, they are generally nonfunctional and subject to degradation (Hazkani-Covo et al., 2010). Misidentifying numts as true mt DNA is a common cause of erroneous conclusions about phylogenetic relationships, biparental inheritance of mt genomes, and de novo mutations (Bensasson et al., 2001; Wu et al., 2020; Lutz-Bonengel et al., 2021). Therefore, we assessed whether our data were contaminated with numt sequences via additional analyses. ...
Article
There is growing evidence that cytonuclear incompatibilities (i.e. disruption of cytonuclear coadaptation) might contribute to the speciation process. In a former study, we described the possible involvement of plastid–nuclear incompatibilities in the reproductive isolation between four lineages of Silene nutans (Caryophyllaceae). Because organellar genomes are usually cotransmitted, we assessed whether the mitochondrial genome could also be involved in the speciation process, knowing that the gynodioecious breeding system of S. nutans is expected to impact the evolutionary dynamics of this genome. Using hybrid capture and high‐throughput DNA sequencing, we analyzed diversity patterns in the genic content of the organellar genomes in the four S. nutans lineages. Contrary to the plastid genome, which exhibited a large number of fixed substitutions between lineages, extensive sharing of polymorphisms between lineages was found in the mitochondrial genome. In addition, numerous recombination‐like events were detected in the mitochondrial genome, loosening the linkage disequilibrium between the organellar genomes and leading to decoupled evolution. These results suggest that gynodioecy shaped mitochondrial diversity through balancing selection, maintaining ancestral polymorphism and, thus, limiting the involvement of the mitochondrial genome in evolution of hybrid inviability between S. nutans lineages.
... nuclear non-functional copies of mitochondrial genes; Bensasson et al., 2001). Here again, NUMTs are supposed to be less abundant than mitochondrial sequences because they are present in a lower copy number within the cell (Andújar et al., 2021), which does not fit the pattern of our data set. ...
Article
Despite being the most important source of liquid freshwater on the planet, groundwater is severely threatened by climate change, agriculture, or industrial mining. It is thus extensively monitored for pollutants and declines in quantity. The organisms living in groundwater, however, are rarely the target of surveillance programmes and little is known about the fauna inhabiting underground habitats. The difficulties accessing groundwater, the lack of expertise, and the apparent scarcity of these organisms challenge sampling and prohibit adequate knowledge on groundwater fauna. Environmental DNA (eDNA) metabarcoding provides an approach to overcome these limitations but is largely unexplored. Here, we sampled water in 20 communal spring catchment boxes used for drinking water provisioning in Switzerland, with a high level of replication at both filtration and amplification steps. We sequenced a portion of the COI mitochondrial gene, which resulted in 4917 ASVs, yet only 3% of the reads could be assigned to a species, genus, or family with more than 90% identity. Careful evaluation of the unassigned reads corroborated that these sequences were true COI sequences belonging mostly to diverse eukaryotic groups, not present in the reference databases. Principal component analyses showed a strong correlation of the community composition with the surface land-use (agriculture vs. forest) and geology (fissured rock vs. unconsolidated sediment). While incomplete reference databases limit the assignment of taxa in groundwater eDNA metabarcoding, we showed that taxonomy-free approaches can reveal large hidden diversity and couple it with major land-use drivers, revealing their imprint on chemical and biological properties of groundwater.
... In addition, events of duplications of some mitochondrial genes [33] or the formation of nuclear-encoded, mitochondrial pseudogenes (NUMTs) [33,37] may also be interpreted erroneously as heteroplasmy. NUMTs are described as a transposition of mitochondrial DNA into the nuclear genome that can retain close homology to the original mitochondrial genes [38]. ...
Conference Paper
Classically, mitochondria have been considered to be maternally inherited, with paternal or biparental inheritance being the exception. However, this view is increasingly questioned, partly because new massively parallel sequencing and qPCR methods have made it possible to detect small variations in mitochondrial DNA in a multitude of organisms. These exceptions to maternal inheritance result in different levels of heteroplasmy, which can arise for five reasons: de novo mutations, recombination events, paternal inheritance, biparental inheritance, or doubly uniparental inheritance. Among decapod crustaceans, we have detected heteroplasmy in the spider crab (Maja brachydactyla) at a high frequency, which cannot be explained by occasional failures to eliminate male mitochondria. Because two congeneric species of Maja (M. squinado and M. brachydactyla) coexist in the south of the Iberian Peninsula, we proposed possible hybridization between these two species, a phenomenon that would hinder the elimination of paternal mitochondria. If this hypothesis were correct, the proportion of heteroplasmic individuals should decrease northward. However, this prediction is not met when populations from different latitudes are studied. In addition, initial analyses of crosses carried out in captivity show biparental inheritance in some cases. In this presentation, we describe other exceptions to mitochondrial inheritance found in decapods and discuss their possible causes.
... One explanation for this divergence is that the duplicated gene is not translated due to factors not apparent in the mitochondrial genome sequence and has accumulated mutations that, although not resulting in an interrupting stop codon in the sequence, left the sequence functional, but not translated. Effectively the result is a pseudogene, a phenomenon that occurs often in genomic evolution (Balakirev and Ayala 2003;Bensasson et al. 2001;van der Burgt et al. 2014). This hypothesis is also a likely explanation for the duplicate ATP9 seen in S. pileatum, which lacked a discernable stop codon. ...
Article
Variation in mitochondrial genome composition across intraspecific, interspecific, and higher taxonomic scales has been little studied in lichen obligate symbioses. Cladonia is one of the most diverse and ecologically important lichen genera, with over 500 species representing an array of unique morphologies and chemical profiles. Here, we assess mitochondrial genome diversity and variation in this flagship genus, with focused sampling of two clades of the “true” reindeer lichens, Cladonia subgenus Cladina, and additional genomes from nine outgroup taxa. We describe composition and architecture at the gene and the genome scale, examining patterns in organellar genome size in larger taxonomic groups in Ascomycota. Mitochondrial genomes of Cladonia, Pilophorus, and Stereocaulon were consistently larger than those of Lepraria and contained more introns, suggesting a selective pressure in asexual morphology in Lepraria driving it toward genomic simplification. Collectively, lichen mitochondrial genomes were larger than most other fungal life strategies, reaffirming the notion that coevolutionary streamlining does not correlate to genome size reductions. Genomes from Cladonia ravenelii and Stereocaulon pileatum exhibited ATP9 duplication, bearing paralogs that may still be functional. Homing endonuclease genes (HEGs), though scarce in Lepraria, were diverse and abundant in Cladonia, exhibiting variable evolutionary histories that were sometimes independent of the mitochondrial evolutionary history. Intraspecific HEG diversity was also high, with C. rangiferina especially bearing a range of HEGs with one unique to the species. This study reveals a rich history of events that have transformed mitochondrial genomes of Cladonia and related genera, allowing future study alongside a wealth of assembled genomes.
... Our reconstruction of phylogenies embedding NUMTs with their mitochondrial counterparts exposed scenarios that strikingly contradicted expectations: except for cox1, NUMTs rarely branch with the original mitochondrial region, particularly the different copies of nad2. In the context of phylogeography or DNA barcoding, this study serves primarily to reinforce the growing concerns about NUMT amplification with mtDNA-designed primers (Bensasson, Zhang, Hartl, & Hewitt, 2001; Yao, Kong, Salas, & Bandelt, 2008). For instance, in species with high genetic diversity, a researcher might accept similarity percentage thresholds on the order of those we use here to define NUMTs and thus report larger numbers of mitochondrial haplotypes (Bertheau et al., 2011). ...
Article
Mito-nuclear insertions, or NUMTs, are genetic material of mitochondrial origin that has been transferred to the nuclear DNA molecule. The increasing amount of genomic data currently being produced presents an opportunity to investigate this type of pattern in the genome evolution of non-model organisms. Identifying NUMTs across a range of closely related taxa allows one to generalize patterns of insertion and maintenance in autosomes, which is ultimately relevant to the understanding of genome biology and evolution. Here we collected existing pairwise genome-mitogenome data of the order Strigiformes, a group that includes all the nocturnal bird predators. We identified NUMTs by applying percent similarity thresholds after blasting mitochondrial genomes against nuclear genome assemblies. We identified NUMTs in all genomes, with numbers ranging from 4 in Bubo bubo to 24 in Ciccaba nigrolineata. Statistical analyses revealed that NUMT size correlates negatively with the NUMT's sequence similarity to the original mtDNA region. Lastly, characterizing these nuclear insertions of mitochondrial origin in a comparative genomics framework produced variable phylogenetic patterns, suggesting in some cases that insertions might pre-date speciation events within Strigiformes.
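As a rough illustration of the similarity-threshold step described in this abstract, the Python sketch below filters BLAST tabular output (the standard twelve-column `-outfmt 6` layout) for hits of a mitogenome query against a nuclear assembly. The cutoffs `min_identity` and `min_length`, the function name, and the example hits are assumptions made for illustration; the abstract does not state the thresholds the authors used.

```python
import csv
from io import StringIO

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def candidate_numts(tabular_text, min_identity=70.0, min_length=100):
    """Keep nuclear hits that pass hypothetical identity and length cutoffs.

    Returns (scaffold, start, end, percent identity) tuples; the cutoffs are
    placeholders, not values taken from the study.
    """
    reader = csv.DictReader(StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t")
    hits = []
    for row in reader:
        identity = float(row["pident"])
        if identity >= min_identity and int(row["length"]) >= min_length:
            hits.append((row["sseqid"], int(row["sstart"]), int(row["send"]), identity))
    return hits


# Two invented hits: one long and similar, one short and divergent.
example = (
    "mito\tscaffold_1\t92.5\t350\t26\t0\t100\t449\t5000\t5349\t1e-100\t400\n"
    "mito\tscaffold_2\t65.0\t80\t28\t0\t10\t89\t700\t779\t1e-5\t50\n"
)
print(candidate_numts(example))
```

In practice the assembly should first be checked for a scaffold that is the mitochondrial genome itself, since that scaffold would otherwise pass any similarity filter.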
... The amplification of these nuclear sequences, instead of or in addition to the mitochondrial sequence, can lead to ambiguous sequences, incorrect phylogenetic placements or misinterpretation as frameshift mutations. Numts can be detected by checking for the occurrence of these effects, thus preventing erroneous results [71,72]. ...
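One of the checks alluded to here, screening a protein-coding mitochondrial sequence for frameshifts and in-frame stop codons, can be sketched in Python as follows. This is only an illustrative sketch, not the procedure of the cited works: it assumes the vertebrate mitochondrial genetic code (stop codons TAA, TAG, AGA and AGG), a known reading frame, and a reference fragment length, and the function name and example sequences are invented.

```python
# Stop codons in the vertebrate mitochondrial genetic code (NCBI table 2).
VERT_MITO_STOPS = frozenset({"TAA", "TAG", "AGA", "AGG"})


def screen_for_numt_signals(seq, ref_len, frame=0, stops=VERT_MITO_STOPS):
    """Flag two classic pseudogene signals in a coding mtDNA fragment.

    Returns (frameshift, stop_positions):
      frameshift     - True if the length differs from the reference length
                       by a non-multiple of three (a frame-shifting indel).
      stop_positions - 0-based offsets of in-frame stop codons.
    `ref_len` and `frame` are assumed to be known for the marker in use.
    """
    cleaned = seq.upper().replace("-", "")  # drop alignment gaps
    frameshift = (len(cleaned) - ref_len) % 3 != 0
    stop_positions = [
        i for i in range(frame, len(cleaned) - 2, 3) if cleaned[i:i + 3] in stops
    ]
    return frameshift, stop_positions


# Invented toy fragments: one with an in-frame stop, one with a 1-bp deletion.
print(screen_for_numt_signals("ATGGCATAAGGC", ref_len=12))
print(screen_for_numt_signals("ATGGCAGGCTA", ref_len=12))
```

A clean result does not prove mitochondrial origin, because recently inserted numts may carry neither signal; for a complete gene, the terminal stop codon would also need to be excluded from the screen.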
Article
Food adulteration is one of the most serious problems regarding food safety and quality worldwide. Besides misleading consumers, it poses a considerable health risk associated with the potential non-labeled allergen content. Fish and fish products are one of the most expensive and widely traded commodities, which predisposes them to being adulterated. Among all fraud types, replacing high-quality or rare fish with a less valuable species predominates. Because fish differ in their allergen content, specifically the main one, parvalbumin, their replacement can endanger consumers. This underlines the need for reliable, robust control systems for fish species identification. Various methods may be used for the aforementioned purpose. DNA-based methods are favored due to the characteristics of the target molecule, DNA, which is heat resistant, and the fact that through its sequencing, several other traits, including the recognition of genetic modifications, can be determined. Thus, they are considered to be powerful tools for identifying cases of food fraud. In this review, the major DNA-based methods applicable for fish meat and product authentication and their commercial applications are discussed, the possibilities of detecting genetic modifications in fish are evaluated, and future trends are highlighted, emphasizing the need for comprehensive and regularly updated online database resources.
... Surprisingly, the size of this NUMT was also almost equal to the size of the whole mtDNA. NUMTs were discovered approximately two decades ago, and they are often characterized as mitochondrial pseudogenes, since they are copies of the mitochondrial genome located on the nuclear genome [14]. As Parakatselaki et al. (2022) [13] highlighted, these NUMTs can lead to false results and cause confusion about the biparental inheritance of mtDNA. ...
Article
Full-text available
Theodosius Dobzhansky famously wrote in 1973 that “nothing in biology makes sense except in the light of evolution” [...]
... This problem can have several origins such as PCR errors and PCR-induced chimeras (Potapov & Ong, 2017) or the amplification of pseudogenes (Buhay, 2009). These are particularly problematic when targeting mitochondrial markers because of the presence of nonfunctional copies of mitochondrial genes in the nuclear genome (NUMTs; Bensasson et al., 2001). These copies, although in theory less abundant than the targeted marker, can accumulate mutations and be as divergent as 36% from their parent sequence (Schultz & Hebert, 2022). ...
Article
Assessment of biodiversity using metabarcoding data, such as from bulk‐ or eDNA sampling, is becoming increasingly relevant in ecology, biodiversity sciences, and monitoring. Thereby, the taxonomic identification of species from their DNA sequences relies strongly on reference databases that link genetic sequences to taxonomic names. These databases vary in completeness and availability, depending on the taxonomic group studied and the genetic region targeted. The incompleteness of reference databases is an important argument to explain the non‐detection by metabarcoding of species supposedly present. However, there exist further and generally overlooked problems with reference databases that can lead to false or inaccurate taxonomic assignment inferences. Here, we synthesize all possible problems inherent to reference databases. In particular, we identify a complete, mutually non‐exclusive list of seven classes of challenges when it comes to selecting, developing, and using a reference database for taxonomic assignment. These are: 1) mislabeling, 2) sequencing errors, 3) sequence conflict, 4) taxonomic conflict, 5) low taxonomic resolution, 6) missing taxon, and 7) missing intraspecific variant. For each problem identified, we provide a description of possible consequences on the taxonomic assignment process. We illustrate the respective problem with examples taken from the literature or obtained by quantitative analyses of public databases, such as Genbank or BOLD. Finally, we discuss possible solutions to the identified problems and how to navigate them. Only by raising users’ awareness of the limitations of metabarcoding data and DNA‐reference databases, adequate interpretations of these data will be achieved.
... Importantly, no mt-RNAs were wrongly accepted and the rejected reads were composed mainly of short reads (Figure 2C, Supplementary Figure 3A). Reads mapping to the other chromosomes were mainly increased (Figure 2B, Supplementary Figure 3A), with the exception of chromosome 1, where many mitochondrial pseudogenes are located in mice [13]. In total, 99% of rejected reads mapped to the mitochondrial chromosome and chromosome 1 (Supplementary Table 1). ...
Preprint
Full-text available
ONT long-read sequencing provides real-time monitoring and controlling of individual nanopores. Adaptive sampling enriches or depletes specific sequences in Nanopore DNA sequencing, but was not applicable to direct sequencing of RNA so far. Here, we identify essential parameter settings for direct RNA sequencing (DRS). We demonstrate the superior performance of depletion over enrichment and show that adaptive sampling efficiently depletes specific transcripts in transcriptome-wide sequencing applications. Specifically, we applied our adaptive sampling approach to polyA+ RNA samples from human cardiomyocytes and mouse whole heart tissue. Herein, we show more than 2.5-fold depletion of highly abundant mitochondrial-encoded transcripts that in normal sequencing account for up to 40% of sequenced bases in heart tissue samples.
... Typically, the high level of differentiation we saw in B. microptera is characteristic of allopatric populations that have undergone long periods of isolation, differing levels of selection, and/or high levels of genetic drift (Coyne and Orr, 2004). Many scenarios can lead to highly divergent sympatric mitochondrial lineages, such as mitochondrial pseudogenes inserted into the nuclear genome (numts) (Bensasson et al., 2001), the presence of cryptic species, hybridization or some other driver of non-random mating, or contact between long-isolated populations. We ruled out the presence of numts since we did not observe frameshifts, stop codons, or double peaks in mitochondrial sequence data. ...
Article
Full-text available
The study of evolution and speciation in non-model systems provides us with an opportunity to expand our understanding of biodiversity in nature. Connectivity studies generally focus on species with obvious boundaries to gene flow, but in open-ocean environments, such boundaries are difficult to identify. Due to the lack of obvious boundaries, speciation and population subdivision in the pelagic environment remain largely unexplained. Comb jellies (Phylum Ctenophora) are mostly planktonic gelatinous invertebrates, many of which are considered to have freely interbreeding distributions worldwide. It is thought that the lobate ctenophore Bolinopsis infundibulum is distributed throughout cooler northern latitudes and B. vitrea throughout warmer ones. Here, we examined the global population structure for species of Bolinopsis with genetic and morphological data. We found distinct evolutionary patterns within the genus, where B. infundibulum had a broad distribution from northern Pacific to Atlantic waters despite many physical barriers, while other species were geographically segregated despite few barriers. Divergent patterns of speciation within the genus suggest that oceanic currents, sea-level, and geological changes over time can act as either barriers or aids to dispersal in the pelagic environment. Further, we used population genomic data to examine evolution in the open ocean of a distinct lineage of Bolinopsis ctenophores from the North Eastern Pacific. Genetic information and morphological observations validated this as a separate species, Bolinopsis microptera, which was previously described but has recently been called B. infundibulum. We found that populations of B. microptera from California were in cytonuclear discordance, which indicates a secondary contact zone for previously isolated populations. Discordance at this scale is rare, especially in a continuous setting.
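The numt screen mentioned in the citing passage for this study (no frameshifts and no stop codons in the mitochondrial sequences) can be sketched roughly as follows. This is only an illustration, not the authors' procedure: the function names are hypothetical, and the default stop codons are those of the invertebrate mitochondrial genetic code (NCBI translation table 5), which is an assumption that should be replaced by the set appropriate to the taxon under study.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
# Assumption: swap in the correct set for the organism being screened.
INVERT_MITO_STOPS = frozenset({"TAA", "TAG"})


def in_frame_stops(seq, frame=0, stops=INVERT_MITO_STOPS):
    """Return 0-based start positions of stop codons in the given frame.

    The sequence is assumed to be an internal fragment of a protein-coding
    gene (for example a barcode region), so any stop codon is suspicious.
    Alignment gap characters are removed before scanning.
    """
    seq = seq.upper().replace("-", "")
    positions = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            positions.append(i)
    return positions


def frameshift_suspected(seq_length, reference_length):
    """A length difference that is not a multiple of 3 implies a frameshift."""
    return (seq_length - reference_length) % 3 != 0


def flag_possible_numt(seq, reference_length, frame=0):
    """Flag a sequence that shows either of the two warning signs."""
    ungapped = seq.replace("-", "")
    return bool(in_frame_stops(seq, frame)) or frameshift_suspected(
        len(ungapped), reference_length
    )
```

A sequence that passes both checks is not thereby proven mitochondrial, since recently inserted numts may carry neither a stop codon nor an indel; the checks only catch the more degraded copies.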
Article
Full-text available
Populations of Eurasian otters Lutra lutra, one of the most widely distributed apex predators in Eurasia, have been depleted mainly since the 1950s. However, a lack of information about their genomic diversity and how they are organized geographically in East Asia severely impedes our ability to monitor and conserve them in particular management units. Here, we re-sequenced and analyzed 20 otter genomes spanning continental East Asia, including a population at Kinmen, a small island off the Fujian coast, China. The otters form three genetic clusters (one of L. l. lutra in the north and two of L. l. chinensis in the south), which have diverged in the Holocene. These three clusters should be recognized as three conservation management units to monitor and manage independently. The heterozygosity of the East Asian otters is as low as that of the threatened carnivores sequenced. Historical effective population size trajectories inferred from genomic variations suggest that their low genomic diversity could be partially attributed to changes in the climate since the mid-Pleistocene and anthropogenic intervention since the Holocene. However, no evidence of genetic erosion, mutation load, or high level of inbreeding was detected in the presumably isolated Kinmen Island population. Any future in situ conservation efforts should consider this information for the conservation management units.
Preprint
Individual sorting and identification of thousands of insects collected in mass trapping biosurveillance programs is a labor-intensive and time-consuming process. Metabarcoding, which allows for the simultaneous identification of multiple individuals in a single mixed sample, has the potential to expedite this process. However, detecting all the species present in a bulk sample can be challenging, especially when under-represented non-native specimens are intercepted. In this study, we quantified the effectiveness of metabarcoding at detecting exotic species within six different mock communities that did or did not include native and non-native species of European xylophagous cerambycid beetles. Although we did not observe significant differences in the total number of species detected between MinION, Illumina, and IonTorrent sequencing technologies, a greater number of individuals was detected and identified to species using MinION, including the detection of three non-native cerambycids. The three sequencing technologies also showed similar results in detecting and identifying closely related species and species at low abundance. The capture method appears to greatly influence sample preservation and detection. Indeed, individuals captured in traps containing monopropylene and water had lower DNA concentrations, leading to lower species detection rates, than individuals killed using just an insecticide without any collection medium.
Article
Full-text available
The mountain bumblebees of the subgenus Alpigenobombus Skorikov, 1914, are uniquely distinctive because the females have enlarged mandibles with six large, evenly spaced teeth, which they use to bite holes in long-corolla flowers for nectar robbing. Recognition of species in this subgenus has been uncertain, with names used in various combinations. To revise the species, we examined COI-like barcodes for evidence of species’ gene coalescents using MrBayes and PTP and we compare the coalescent groups with morphological variation for integrative assessment. While we seek to include only orthologous barcodes (the ‘good’) and exclude all of the more strongly divergent barcode-like numts (the ‘bad’), for some nominal taxa only low-divergence numts could be obtained (the ‘ugly’). For taxa with no orthologous sequences available, using a minimum number of the lowest divergence numts did yield coalescent candidates for species that were consistent with morphologically diagnosable groups. These results agree in recognising 11 species within this subgenus, supporting: (1) recognising the widespread European Bombus mastrucatus Gerstaecker, 1869 stat. rev. as a species separate from the west Asian B. wurflenii Radoszkowski, 1860 s. str.; (2) the recently recognised B. rainai Williams, 2022, as a species separate from B. kashmirensis Friese, 1909, within the western Himalaya; (3) the recognition once again of B. sikkimi Friese, 1918 stat. rev. and B. validus Friese, 1905 stat. rev. as species separate from B. nobilis Friese, 1905 s. str. within the eastern Himalaya and Hengduan regions; (4) confirming the recognition of B. angustus Chiu, 1948, B. breviceps Smith, 1852 s. lat., B. genalis Friese, 1918, and B. grahami (Frison, 1933) as separate species within the Himalaya, China, and Southeast Asia; (5) recognising the conspecificity of the nominal taxa (not species) channicus Gribodo, 1892 (Southeast Asia) and dentatus Handlirsch, 1888 (Himalaya) as parts of the species B. breviceps s. lat. (southern and eastern China); and (6) recognising the conspecificity of the rare taxon beresovskii (Skorikov, 1933) syn. n. as part of the species B. grahami within China. Nectar robbing by bumblebees is reviewed briefly and prospects for future research discussed.
Article
Nanopore long-read sequencing enables real-time monitoring and controlling of individual nanopores. This makes it possible to enrich or deplete specific sequences in DNA sequencing in a process called “adaptive sampling”. So far, adaptive sampling was not applicable to direct sequencing of RNA. Here, we show that adaptive sampling is feasible and useful for direct RNA sequencing, which has its specific technical and biological challenges. Employing a well-controlled in vitro transcript-based model system, we identify essential characteristics and parameter settings for adaptive sampling in direct RNA sequencing, such as the superior performance of depletion over enrichment. Here, the efficiency of depletion is close to the theoretical maximum. Additionally, we demonstrate that adaptive sampling efficiently depletes specific transcripts in transcriptome-wide sequencing applications. Specifically, we applied our adaptive sampling approach to polyA-enriched RNA samples from human induced pluripotent stem cell-derived cardiomyocytes and mouse whole heart tissue and show efficient 2.5- to 2.8-fold depletion of highly abundant mitochondrial-encoded transcripts. Finally, we characterize depletion and enrichment performance for complex transcriptome subsets, i.e. at the level of the entire Chromosome 11, proving the general applicability of direct RNA adaptive sampling. Our analyses provide evidence that adaptive sampling is especially useful to enable detection of lowly expressed transcripts and reduce the sequencing of highly abundant disturbing transcripts. Workflow and sequencing data are provided on Zenodo: https://doi.org/10.5281/zenodo.7701823
Article
Full-text available
DNA barcoding represents a handy tool for species identification. In addition, it serves as a complementary approach that improves the characterisation of evolutionary lineages and facilitates the detection of potentially undescribed and cryptic species. Based on the case study in the Western Carpathians, which belong to the Carpathian biodiversity hotspot, we have compiled the first DNA barcode reference library for molecular identification of invertebrates associated with epikarst, a unique, yet understudied, shallow subterranean aquatic habitat that extends at the interface between the soil and carbonate rocks. We analysed invertebrates collected between 2019 and 2020 from epikarst water that continuously seeps into four caves of the Demänovský Cave System in northern Slovakia. The standard barcode marker of the mitochondrial COI gene was amplified in more than 920 individuals of aquatic, semi-aquatic, and terrestrial invertebrates. The final data set consisted of 784 barcode sequences representing 36 morphospecies, the majority (98.3%) belonged to Arthropoda. Automated cluster delineation using the Barcode of Life Data System (BOLD) revealed 60 Barcode Index Numbers (BINs), of which 43 BINs were new to BOLD, representing mostly typical subterranean species. Almost 20% of the morphospecies displayed high intraspecific variation (>2.2%), suggesting the need for further investigation to assess potential taxonomic problems or cryptic diversity. Our results also indicated the existence of several yet undescribed invertebrate species and possible heteroplasmy or COI numts in the collembolan Megalothorax sp. (incertus species group). The resulting DNA barcode library represents a significant advance not only in the characterisation of epikarst biodiversity but also in the understanding of subterranean biodiversity in general, paving the way for future complex evolutionary and biogeographical studies.
Preprint
Full-text available
Toxoplasma gondii is a zoonotic protist pathogen that infects up to 1/3 of the human population. This apicomplexan parasite contains three genome sequences: nuclear (63 Mb); plastid organellar, ptDNA (35 kb); and mitochondrial organellar, mtDNA (5.9 kb of non-repetitive sequence). We find that the nuclear genome contains a significant amount of NUMTs (nuclear DNA of mitochondrial origin) and NUPTs (nuclear DNA of plastid origin) that are continuously acquired and represent a significant source of intraspecific genetic variation. NUOT (nuclear DNA of organellar origin) accretion has generated 1.6% of the extant T. gondii ME49 nuclear genome, the highest fraction ever reported in any organism. NUOTs are primarily found in organisms that retain the non-homologous end-joining repair pathway. Significant movement of organellar DNA was experimentally captured via amplicon sequencing of a CRISPR-induced double-strand break in non-homologous end-joining repair competent, but not ku80 mutant, Toxoplasma parasites. Comparisons with Neospora caninum, a species that diverged from Toxoplasma ∼28 MY ago, revealed that the movement and fixation of 5 NUMTs predates the split of the two genera. This unexpected level of NUMT conservation suggests evolutionary constraint for cellular function. Most NUMT insertions reside within (60%) or near genes (23% within 1.5 kb) and reporter assays indicate that some NUMTs have the ability to function as cis-regulatory elements modulating gene expression. Together these findings portray a role for organellar sequence insertion in dynamically shaping the genomic architecture and likely contributing to adaptation and phenotypic changes in this important human pathogen.
Significance Statement: This study reveals how DNA located in cellular compartments called organelles can be transferred to the nucleus of the cell and inserted into the nuclear genome of the apicomplexan parasite Toxoplasma. Insertions alter the DNA sequence and may lead to significant changes in how genes function. Unexpectedly, we found that the human protist pathogen Toxoplasma gondii and closely related species have the largest observed organellar genome fragment content (>11,000 insertions comprising over 1 Mb of DNA) inserted into their nuclear genome sequence despite their compact 65 Mb nuclear genome. Insertions are occurring at a rate that makes them a significant mutational force that deserves further investigation when examining causes of adaptation and virulence of these parasites.
Preprint
It is increasingly evident that the demographic history and phylogeographic consequences of past climate changes unfolded locally and varied from region to region. Despite the high species richness and endemism of Murinae rodents in low-latitude Asia, how past climatic fluctuations shaped phylogeographic and demographic history in this area remains unknown. Here we trapped 253 murine individuals in the field and successfully amplified COI gene sequences for DNA barcoding. The phylogenetic tree showed that the Murinae diversification included ten species belonging to the Rattini and Murini tribes. Divergence dating suggested that the time to the most recent common ancestor (TMRCA) of each rodent species fell in the Early or Middle Pleistocene. Bayesian skyline plots (BSPs) showed that the onset of population growth in seven Murinae rodents occurred during the penultimate or last glaciation, while R. losea and R. norvegicus kept a constant effective population size over time. Additionally, ecological niche models (ENMs) projected that the refugial range of six rodent species during the LGM was larger than their currently suitable area, whereas the remaining four species showed contracted refugia. Hence, our results suggest that the rodent community displayed asynchronous demographic and phylogeographic dynamics in low-latitude Asia.
Article
Full-text available
Inserts of DNA from extranuclear sources, such as organelles and microbes, are common in eukaryote nuclear genomes. However, sequence similarity between the nuclear and extranuclear DNA, and a history of multiple insertions, make the assembly of these regions challenging. Consequently, the number, sequence, and location of these vagrant DNAs cannot be reliably inferred from the genome assemblies of most organisms. We introduce two statistical methods to estimate the abundance of nuclear inserts even in the absence of a nuclear genome assembly. The first (intercept method) only requires low-coverage (<1x) sequencing data, as commonly generated for population studies of organellar and ribosomal DNAs. The second method additionally requires that a subset of the individuals carry extra-nuclear DNA with diverged genotypes. We validated our intercept method using simulations and by re-estimating the frequency of human NUMTs (nuclear mitochondrial inserts). We then applied it to the grasshopper Podisma pedestris, exceptional for both its large genome size and reports of numerous NUMT inserts, estimating that NUMTs make up 0.056% of the nuclear genome, equivalent to >500 times the mitochondrial genome size. We also re-analysed a museomics dataset of the parrot Psephotellus varius, obtaining an estimate of only 0.0043%, in line with reports from other species of bird. Our study demonstrates the utility of low-coverage high-throughput sequencing data for the quantification of nuclear vagrant DNAs. Beyond quantifying organellar inserts, these methods could also be used on endosymbiont-derived sequences. We provide an R implementation of our methods called "vagrantDNA" and code to simulate test datasets.
Article
Full-text available
The integration of mitochondrial genome fragments into the nuclear genome is well documented, and the transfer of these mitochondrial nuclear pseudogenes (numts) is thought to be an ongoing evolutionary process. With the increasing number of eukaryotic genomes available, genome-wide distributions of numts are often surveyed. However, inconsistencies in genome quality can reduce the accuracy of numt estimates, and methods used for identification can be complicated by the diverse sizes and ages of numts. Numts have been previously characterized in rodent genomes and it was postulated that they might be more prevalent in a group of voles with rapidly evolving karyotypes. Here, we examine 37 rodent genomes, and an additional 26 vertebrate genomes, while also considering numt detection methods. We identify numts using DNA:DNA and protein:translated-DNA similarity searches and compare numt distributions among rodent and vertebrate taxa to assess whether some groups are more susceptible to transfer. A combination of protein sequence comparisons (protein:translated-DNA) and BLASTN genomic DNA searches detect 50% more numts than genomic DNA:DNA searches alone. In addition, higher-quality RefSeq genomes produce lower estimates of numts than GenBank genomes, suggesting that lower quality genome assemblies can overestimate numts abundance. Phylogenetic analysis shows that mitochondrial transfers are not associated with karyotypic diversity among rodents. Surprisingly, we did not find a strong correlation between numt counts and genome size. Estimates using DNA: DNA analyses can underestimate the amount of mitochondrial DNA that is transferred to the nucleus.
Article
Full-text available
In contrast to extensive infiltration of plant nuclear genomes by mitochondrial and chloroplast DNA fragments, a computer assessment method could only detect seven mitochondrial DNA integration events in Saccharomyces cerevisiae chromosomes and five examples of DNA migration into mammalian nuclear genes. No evidence could be detected for mitochondrial DNA insertion into chromosome III of Caenorhabditis elegans or in nuclear DNA sequences of Drosophila sp. or Plasmodium falciparum. Thus, the quantity of organellar DNA in the nucleus appears to vary amongst organisms and is lower in Saccharomyces cerevisiae than suggested by experimental plasmid systems. As in plants, migratory mitochondrial DNA fragments in yeast and mammals are found in intergenic regions and introns. Although many of these insertions are located near retroelements, mitochondrial DNA incorporation appears to be independent of retroelement insertion. Comparison of the mitochondrial DNA fragments with mitochondrial transcription maps suggest that two fragments may have transposed through DNA-based and one through RNA-based mechanisms. Analyses of the integration sites indicate that organellar DNA sequences are incorporated by an end-joining mechanism common to yeast, mammals, and plants. The transferred sequences also provide a novel perspective on rates and patterns of nucleotide substitution. Analysis of the D-loop region including a nuclear copy of mitochondrial DNA supports a progressive reduction in D-loop length within both monkey and great apes mitochondrial lineages. Relative distance tests polarized with nuclear copies of the mitochondrial 12S/16S rRNA region suggest that a constant number of transversions has accumulated within the great ape clade, but the number of transitions in orangutan is elevated with respect to members of the human/chimp/gorilla clade. 
In addition to DNA migration events, 29 nuclear/mitochondrial genes were identified in GenBank that appear to result from inadvertent ligation of nuclear and mitochondrial mRNA transcripts during the cloning process.
Article
Full-text available
We cloned and sequenced a segment of mitochondrial DNA from human, chimpanzee, gorilla, orangutan, and gibbon. This segment is 896 bp in length, contains the genes for three transfer RNAs and parts of two proteins, and is homologous in all 5 primates. The 5 sequences differ from one another by base substitutions at 283 positions and by a deletion of one base pair. The sequence differences range from 9 to 19% among species, in agreement with estimates from cleavage map comparisons, thus confirming that the rate of mtDNA evolution in primates is 5 to 10 times higher than in nuclear DNA. The most striking new finding to emerge from these comparisons is that transitions greatly outnumber transversions. Ninety-two percent of the differences among the most closely related species (human, chimpanzee, and gorilla) are transitions. For pairs of species with longer divergence times, the observed percentage of transitions falls until, in the case of comparisons between primates and non-primates, it reaches a value of 45. The time dependence is probably due to obliteration of the record of transitions by multiple substitutions at the same nucleotide site. This finding illustrates the importance of choosing closely related species for analysis of the evolutionary process. The remarkable bias toward transitions in mtDNA evolution necessitates the revision of equations that correct for multiple substitutions at the same site. With revised equations, we calculated the incidence of silent and replacement substitutions in the two protein-coding genes. The silent substitution rate is 4 to 6 times higher than the replacement rate, indicating strong functional constraints at replacement sites. Moreover, the silent rate for these two genes is about 10% per million years, a value 10 times higher than the silent rate for the nuclear genes studied so far. In addition, the mean substitution rate in the three mitochondrial tRNA genes is at least 100 times higher than in nuclear tRNA genes. 
Finally, genealogical analysis of the sequence differences supports the view that the human lineage branched off only slightly before the gorilla and chimpanzee lineages diverged and strengthens the hypothesis that humans are more related to gorillas and chimpanzees than is the orangutan. Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/48036/1/239_2005_Article_BF01734101.pdf
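The transition-versus-transversion comparison central to this abstract can be illustrated with a small sketch that classifies the differences between two aligned sequences. The function name and the short example sequences are hypothetical; the sketch simply counts observed differences and does not apply the multiple-substitution corrections the abstract refers to.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    A transition swaps a purine for a purine (A<->G) or a pyrimidine for a
    pyrimidine (C<->T); every other base difference is a transversion.
    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical toy alignment: A/G and C/T are transitions, G/T a transversion.
ts, tv = count_ts_tv("AACCGT", "GACTTT")
transition_fraction = ts / (ts + tv)
```

For closely related sequences the observed fraction approximates the underlying bias; for more distant pairs, repeated substitutions at the same site erase part of the transition record, which is the time dependence the abstract describes.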
Article
Full-text available
A nuclear integration of a mitochondrial control region sequence on human chromosome 9 has been isolated. PCR analyses with primers specific for the respective insertion-flanking nuclear regions showed that the insertion took place on the lineage leading to Hominoidea (gibbon, orangutan, gorilla, chimpanzee, and human) after the Old World monkey-Hominoidea split. The sequences of the control region integrations were determined for humans, chimpanzees, gorillas, orangutans, and siamangs. These sequences were then used to construct phylogenetic trees with different methods, relating them with several hominoid, Old World monkey, and New World monkey mitochondrial control region sequences. Applying maximum-likelihood, neighbor-joining, and parsimony algorithms, the insertion clade was attached to the branch leading to the hominoid mitochondrial sequences as expected from the PCR-determined presence/absence of this integration. An unexpectedly long branch leading to the internal node that connects all insertion sequences was observed for the different phylogeny reconstruction procedures. This finding is not totally compatible with the lower evolutionary rate in the nucleus than in the mitochondrial compartment. We determined the unambiguous substitutions on the branch leading to the most recent common ancestor (MRCA) of the mitochondrial inserts according to the parsimony criterion. We propose that they are unlikely to have been caused by damage of the transposing nucleic acid and that they are probably due to a change in the evolutionary mode after the transposition.
Article
Full-text available
A surprisingly large number of plant nuclear DNA sequences inferred to be remnants of chloroplast and mitochondrial DNA migration events were detected through computer-assisted database searches. Nineteen independent organellar DNA insertions, with a median size of 117 bp (range of 38 to > 785 bp), occur in the proximity of 15 nuclear genes. One fragment appears to have been passed through a RNA intermediate, based on the presence of an edited version of the mitochondrial gene in the nucleus. Tandemly arranged fragments from disparate regions of organellar genomes and from different organellar genomes indicate that the fragments joined together from an intracellular pool of RNA and/or DNA before they integrated into the nuclear genome. Comparisons of integrated sequences to genes lacking the insertions, as well as the occurrence of coligated fragments, support a model of random integration by end joining. All transferred sequences were found in noncoding regions, but the positioning of organellar-derived DNA in introns, as well as regions 5' and 3' to nuclear genes, suggests that the random integration of organellar DNA has the potential to influence gene expression patterns. A semiquantitative estimate was performed on the amount of organellar DNA being transferred and assimilated into the nucleus. Based on this database survey, we estimate that 3-7% of the plant nuclear genomic sequence files contain organellar-derived DNA. The timing and the magnitude of genetic flux to the nuclear genome suggest that random integration is a substantial and ongoing process for creating sequence variation.
Article
The mitochondrial DNA of plant and animal cells is a transcriptionally active genome that traces its origins to a symbiotic infection of eucaryotic cells by bacterial progenitors. As prescribed by the Serial Endosymbiosis Theory, symbiotic organelles have gradually transferred their genes to the eucaryotic genome, producing a functional interaction of nuclear and mitochondrial genes in organelle function. We report here a recent remarkable transposition of 7.9 kb of a typically 17.0-kb mitochondrial genome to a specific nuclear chromosomal position in the domestic cat. The integrated segment has subsequently become amplified 38-76 times and now occurs as a tandem repeat macrosatellite with multiple-length alleles resolved by pulsed-field gel electrophoresis (PFGE) segregating in cat populations. Sequence determination of the nuclear mitochondrial DNA segment, Numt, revealed a d(CA)-rich 8-bp motif [ACACACGT] repeated imperfectly five times at the deletion junction that is a likely target for recombination. The extent and pattern of sequence divergence of Numt genes from the cytoplasmic mtDNA homologues plus the occurrence of Numt in other species of the family Felidae allowed an estimate for the origins of Numt at 1.8-2.0 million years ago in an ancestor of four modern species in the genus Felis. Numt genes do not function in cats; rather, the locus combines properties of nuclear minisatellites and pseudogenes. These observations provide an empirical glimpse of historic genomic events that may parallel the accommodation of organelles in eucaryotes.
Article
Polymerase chain reaction (PCR) products corresponding to 803 bp of the cytochrome oxidase subunits I and II region of mitochondrial DNA (mtDNA COI-II) were deduced to consist of multiple haplotypes in three Sitobion species. We investigated the molecular basis of these observations. PCR products were cloned, and six clones from one individual per species were sequenced. In each individual, one sequence was found commonly, but also two or three divergent sequences were seen. The divergent sequences were shown to be nonmitochondrial by sequencing from purified mtDNA and Southern blotting experiments. All seven nonmitochondrial clones sequenced to completion were unique. Nonmitochondrial sequences have a high proportion of unique sites, and very few characters are shared between nonmitochondrial clones to the exclusion of mtDNA. From these data, we infer that fragments of mtDNA have been transposed separately (probably into aphid chromosomes), at a frequency only known to be equalled in humans. The transposition phenomenon appears to occur infrequently or not at all in closely related genera and other aphids investigated. Patterns of nucleotide substitution in mtDNA inferred over a parsimony tree are very different from those in transposed sequences. Compared with mtDNA, nonmitochondrial sequences have less codon position bias, more even exchanges between A, G, C and T, and a higher proportion of nonsynonymous replacements. Although these data are consistent with the transposed sequences being under less constraint than mtDNA, changes in the nonmitochondrial sequences are not random: there remains significant position bias, and probable excesses of synonymous replacements and of conservative inferred amino acid replacements. We conclude that a proportion of the inferred change in the nonmitochondrial sequences occurred before transposition. 
We believe that Sitobion aphids (and other species exhibiting mtDNA transposition) may be important for studying the molecular evolution of mtDNA and pseudogenes. However, our data highlight the need to establish the true evolutionary relationships between sequences in comparative investigations.
Article
In contrast to extensive infiltration of plant nuclear genomes by mitochondrial and chloroplast DNA fragments, a computer assessment method could only detect seven mitochondrial DNA integration events in Saccharomyces cerevisiae chromosomes and five examples of DNA migration into mammalian nuclear genes. No evidence could be detected for mitochondrial DNA insertion into chromosome III of Caenorhabditis elegans or in nuclear DNA sequences of Drosophila sp. or Plasmodium falciparum. Thus, the quantity of organellar DNA in the nucleus appears to vary amongst organisms and is lower in Saccharomyces cerevisiae than suggested by experimental plasmid systems. As in plants, migratory mitochondrial DNA fragments in yeast and mammals are found in intergenic regions and introns. Although many of these insertions are located near retroelements, mitochondrial DNA incorporation appears to be independent of retroelement insertion. Comparison of the mitochondrial DNA fragments with mitochondrial transcription maps suggests that two fragments may have transposed through DNA-based and one through RNA-based mechanisms. Analyses of the integration sites indicate that organellar DNA sequences are incorporated by an end-joining mechanism common to yeast, mammals, and plants. The transferred sequences also provide a novel perspective on rates and patterns of nucleotide substitution. Analysis of the D-loop region including a nuclear copy of mitochondrial DNA supports a progressive reduction in D-loop length within both monkey and great ape mitochondrial lineages. Relative distance tests polarized with nuclear copies of the mitochondrial 12S/16S rRNA region suggest that a constant number of transversions has accumulated within the great ape clade, but the number of transitions in orangutan is elevated with respect to members of the human/chimp/gorilla clade.
In addition to DNA migration events, 29 nuclear/mitochondrial genes were identified in GenBank that appear to result from inadvertent ligation of nuclear and mitochondrial mRNA transcripts during the cloning process.
Article
Differential rates of nucleotide substitution among different gene segments and between distinct evolutionary lineages are well documented among mitochondrial genes and are likely a consequence of locus-specific selective constraints that delimit mutational divergence over evolutionary time. We compared sequence variation of 18 homologous loci (15 coding genes and 3 parts of the control region) among 10 mammalian mitochondrial DNA genomes which allowed us to describe different mitochondrial evolutionary patterns and to produce an estimation of the relative order of gene divergence. The relative rates of divergence of mitochondrial DNA genes in the family Felidae were estimated by comparing their divergence from homologous counterpart genes included in nuclear mitochondrial DNA (Numt, pronounced "new might"), a genomic fossil that represents an ancient transfer of 7.9 kb of mitochondrial DNA to the nuclear genome of an ancestral species of the domestic cat (Felis catus). Phylogenetic analyses of mitochondrial (mtDNA) sequences with multiple outgroup species were conducted to date the ancestral node common to the Numt and the cytoplasmic (Cymt) mtDNA genes and to calibrate the rate of sequence divergence of mitochondrial genes relative to nuclear homologous counterparts. By setting the fastest substitution rate as strictly mutational, an empirical "selective retardation index" is computed to quantify the sum of all constraints, selective and otherwise, that limit sequence divergence of mitochondrial gene sequences over time.
Article
Nuclear-localized mtDNA pseudogenes might explain a recent report describing a heteroplasmic mtDNA molecule containing five linked missense mutations dispersed over the contiguous mtDNA CO1 and CO2 genes in Alzheimer's disease (AD) patients. To test this hypothesis, we have used the PCR primers utilized in the original report to amplify CO1 and CO2 sequences from two independent ρ⁰ (mtDNA-less) cell lines. CO1 and CO2 sequences amplified from both of the ρ⁰ cell lines, demonstrating that these sequences are also present in the human nuclear DNA. The nuclear pseudogene CO1 and CO2 sequences were then tested for each of the five "AD" missense mutations by restriction endonuclease site variant assays. All five mutations were found in the nuclear CO1 and CO2 PCR products from ρ⁰ cells, but none were found in the PCR products obtained from cells with normal mtDNA. Moreover, when the overlapping nuclear CO1 and CO2 PCR products were cloned and sequenced, all five missense mutations were found, as well as a linked synonymous mutation. Unlike the findings in the original report, an additional 32 base substitutions were found, including two in adjacent tRNAs and a two base pair deletion in the CO2 gene. Phylogenetic analysis of the nuclear CO1 and CO2 sequences revealed that they diverged from modern human mtDNAs early in hominid evolution about 770,000 years before present. These data would be consistent with the interpretation that the missense mutations proposed to cause AD may be the product of ancient mtDNA variants preserved as nuclear pseudogenes.
Article
A wide-ranging examination of plastid (pt)DNA sequence homologies within higher plant nuclear genomes (promiscuous DNA) was undertaken. Digestion with methylation-sensitive restriction enzymes and Southern analysis were used to distinguish plastid and nuclear DNA in order to assess the extent of variability of promiscuous sequences within and between plant species. Some species, such as Gossypium hirsutum (cotton), Nicotiana tabacum (tobacco), and Chenopodium quinoa, showed homogeneity of these sequences, while intraspecific sequence variation was observed among different cultivars of Pisum sativum (pea), Hordeum vulgare (barley), and Triticum aestivum (wheat). Hypervariability of plastid sequence homologies was identified in the nuclear genomes of Spinacia oleracea (spinach) and Beta vulgaris (beet), in which individual plants were shown to possess a unique spectrum of nuclear sequences with ptDNA homology. This hypervariability apparently extended to somatic variation in B. vulgaris. No sequences with ptDNA homology were identified by this method in the nuclear genome of Arabidopsis thaliana.
Article
To estimate patterns of molecular evolution of unconstrained DNA sequences, we used maximum parsimony to separate phylogenetic trees of a non-long terminal repeat retrotransposable element into either internal branches, representing mainly the constrained evolution of active lineages, or into terminal branches, representing mainly nonfunctional "dead-on-arrival" copies that are unconstrained by selection and evolve as pseudogenes. The pattern of nucleotide substitutions in unconstrained sequences is expected to be congruent with the pattern of point mutation. We examined the retrotransposon Helena in the Drosophila virilis species group (subgenus Drosophila) and the Drosophila melanogaster species subgroup (subgenus Sophophora). The patterns of point mutation are indistinguishable, suggesting considerable stability over evolutionary time (40-60 million years). The relative frequencies of different point mutations are unequal, but the "transition bias" results largely from an approximately 2-fold excess of G.C to A.T substitutions. Spontaneous mutation is biased toward A.T base pairs, with an expected mutational equilibrium of approximately 65% A + T (quite similar to that of long introns). These data also enable the first detailed comparison of patterns of point mutations in Drosophila and mammals. Although the patterns are different, all of the statistical significance comes from a much greater rate of G.C to A.T substitution in mammals, probably because of methylated cytosine "hotspots." When the G.C to A.T substitutions are discounted, the remaining differences are considerably reduced and not statistically significant.
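The link between a mutational bias toward A.T and an expected equilibrium base composition can be illustrated with a deliberately simplified two-state model. The sketch below is not the calculation used in the study summarized above; it assumes that every site is either G.C or A.T, that the two per-site mutation rates are constant, and that the roughly 2-fold excess of G.C to A.T changes can be read as a rate ratio of 2:1. The function name and the example rates are hypothetical.

```python
def equilibrium_at_fraction(gc_to_at_rate: float, at_to_gc_rate: float) -> float:
    """Equilibrium A+T fraction under a two-state (G.C <-> A.T) mutation model.

    At equilibrium the flux in both directions balances:
        (1 - f_AT) * gc_to_at_rate == f_AT * at_to_gc_rate
    which gives f_AT = gc_to_at_rate / (gc_to_at_rate + at_to_gc_rate).
    """
    if gc_to_at_rate < 0 or at_to_gc_rate < 0:
        raise ValueError("rates must be non-negative")
    total = gc_to_at_rate + at_to_gc_rate
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return gc_to_at_rate / total


# Assumed rate ratio of 2:1 (G.C -> A.T twice as frequent as A.T -> G.C).
at_eq = equilibrium_at_fraction(2.0, 1.0)
print(f"expected equilibrium A+T fraction: {at_eq:.3f}")
```

Under these assumptions the model gives an A+T fraction of about two thirds, in the same neighborhood as the approximately 65% reported in the abstract; a full treatment would use all twelve substitution rates rather than two pooled classes.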
Article
There is growing evidence that the integration of mitochondrial DNA sequences into nuclear and chloroplast genomes of higher organisms may be widespread rather than exceptional. We report the localization of 18S-25S rDNA and mitochondrial DNA sequences to meiotic chromosomes of several orthopteran species using in situ hybridisation. The cytochrome oxidase I (COI) sequence localizes to the centromeric and two telomeric regions of the eight bivalents of Chorthippus parallelus, the telomeric regions in Schistocerca gregaria and is present throughout the genome of Italopodisma sp. (Orthoptera: Acrididae). The control region of the mitochondrion and COI localize to similar chromosomal regions in S. gregaria. These data explain sequencing data that are inconsistent with the COI sequence being solely mitochondrial. The different nuclear locations of mtDNA in the different genera studied suggest that grasshopper mtDNA-like sequences have been inserted into the nuclear genome more than once in Acridid history, and there may have been different mechanisms involved when these events occurred in each of these species.
Key words: Schistocerca gregaria, Italopodisma spp., Chorthippus parallelus, in situ hybridisation, mitochondrial DNA, genome organization.
Article
To better understand the organization of the genome of bats, we used in situ hybridization to examine the number of ribosomal DNA sites in 50 species of bats representing both suborders, 7 families, and 38 genera. Number of sites ranged from one to four pairs (average, 1.76) in bats, whereas the number of sites in 40 species of rodents ranged from two to ten pairs (average, 4.19). The possible relationship of a reduced number of sites to a smaller amount of DNA in the genome of bats is explored. We find little evidence to support the hypothesis that bats are retaining a fixed primitive condition of a low number of sites and we conclude that the most probable explanation is that bats, like other groups of mammals, have mechanisms that tend to increase the number of sites. However, the balance between mechanisms that increase and those that reduce the number of sites is more strongly in favor of reduction of sites than is characteristic of other mammals such as rodents.
Article
Many mitochondrial and plastid proteins are derived from their bacterial endosymbiotic ancestors, but their genes now reside on nuclear chromosomes instead of remaining within the organelle. To become an active nuclear gene and return to the organelle as a functional protein, an organellar gene must first be assimilated into the nuclear genome. The gene must then be transcribed and acquire a transit sequence for targeting the protein back to the organelle. On reaching the organelle, the protein must be properly folded and modified, and in many cases assembled in an orderly manner into a larger protein complex. Finally, the nuclear copy must be properly regulated to achieve a fitness level comparable with the organellar gene. Given the complexity in establishing a nuclear copy, why do organellar genes end up in the nucleus? Recent data suggest that these genes are worse off than their nuclear and free-living counterparts because of a reduction in the efficiency of natural selection, but do these population–genetic processes drive the movement of genes to the nucleus? We are now at a stage where we can begin to discriminate between competing hypotheses using a combination of experimental, natural population, bioinformatic and theoretical approaches.
Article
Chromosomal double-strand breaks (DSBs) can be repaired by either homology-dependent or homology-independent pathways. Using a novel intron-based genetic assay to identify rare homology-independent DNA rearrangements associated with repair of a chromosomal DSB in S. cerevisiae, we observed that approximately 20% of rearrangements involved endogenous DNA insertions at the break site. We have analyzed 37 inserts and find they fall into two distinct classes: Ty1 cDNA intermediates varying in length from 140 bp to 3.4 kb and short mitochondrial DNA fragments ranging in size from 33 bp to 219 bp. Several inserts consist of multiple noncontiguous mitochondrial DNA segments. These results demonstrate an ongoing mechanism for genome evolution through acquisition of organellar and mobile DNAs at DSB sites.
Article
Animal mitochondrial DNA has proved a valuable marker in intraspecific systematic studies. However, if nucleotide sequence heterogeneity exists at the individual level, its usefulness will be much reduced. This study demonstrates that the presence of highly conserved non-coding mitochondrial sequences in the nuclear genome of Schistocerca gregaria greatly impairs the use of mtDNA in population genetic studies. Caution is called for in other organisms; and it seems necessary to check for conserved nuclear copies of mitochondrial sequences before launching into a large scale analysis of populations using mtDNA as a genetic marker. Experimental procedures are suggested for this purpose.
Article
The combined use of mitochondrial DNA markers and polymerase chain reaction (PCR) techniques has greatly enhanced evolutionary studies. These techniques have also promoted the discovery of mitochondrial-like sequences in the nuclear genomes of many animals. While the nuclear sequences themselves are interesting, and capable of serving as valuable molecular tools, they can also confound phylogenetic and population genetic studies. Clearly, a better understanding of these phenomena and vigilance towards misleading data are needed.
Article
By using the polymerase chain reaction to amplify and sequence 178 bp of a rapidly evolving region of the mtDNA genome (segment I of the control region) from 81 individuals, approximately 11% of the variation present in the lesser snow goose Chen caerulescens caerulescens L. mitochondrial genome was surveyed. The 26 types of mtDNA detected formed two distinct mitochondrial clades that differ by an average of 6.7% and are distributed across the species range. Restriction analysis of amplified fragments was then used to assign the mtDNA of an additional 29 individuals to either of these clades. Within one major clade, sequence among mtDNAs was concordant with geographic location. Within the other major clade the degree of sequence divergence among haplotypes was lower and no consistent geographic structuring was evident. The two major clades presumably result from vicariant separation of lesser snow geese during the Pleistocene.
Article
The relative rates of point nucleotide substitution and accumulation of gap events (deletions and insertions) were calculated for 22 human and 30 rodent processed pseudogenes. Deletion events not only outnumbered insertions (the ratio being 7:1 and 3:1 for human and rodent pseudogenes, respectively), but also the total length of deletions was greater than that of insertions. Compared with their functional homologs, human processed pseudogenes were found to be shorter by about 1.2%, and rodent pseudogenes by about 2.3%. DNA loss from processed pseudogenes through deletion is estimated to be at least seven times faster in rodents than in humans. In comparison with the rate of point substitutions, the abridgment of pseudogenes during evolutionary times is a slow process that probably does not retard the rate of growth of the genome due to the proliferation of processed pseudogenes.
Article
Analysis of the rate of nucleotide substitution at silent sites in Drosophila genes reveals three main points. First, the silent rate varies (by a factor of two) among nuclear genes; it is inversely related to the degree of codon usage bias, and so selection among synonymous codons appears to constrain the rate of silent substitution in some genes. Second, mitochondrial genes may have evolved only as fast as nuclear genes with weak codon usage bias (and two times faster than nuclear genes with high codon usage bias); this is quite different from the situation in mammals where mitochondrial genes evolve approximately 5-10 times faster than nuclear genes. Third, the absolute rate of substitution at silent sites in nuclear genes in Drosophila is about three times higher than the average silent rate in mammals.
Article
Using specific probes we show that sequences homologous to NADH dehydrogenase Subunit 6, and Cytochrome oxidase Subunits I, II, and III mitochondrial genes are present in nuclear DNA from various tissues. These mitochondrial-like sequences are also present in rat hepatoma nuclear DNA but with an abnormal organization and a higher copy number than in normal hepatocytes.
Article
Thirty-three phage clones carrying DNAs homologous to human mitochondrial DNA (mtDNA) were isolated from two independently constructed human gene libraries, and the region and extent of homology of mtDNA-like sequences carried by these clones were examined in hybridization experiments. Each phage clone contained DNA sequences homologous to various parts of the mtDNA and the extent of homology differed from clone to clone. From the efficiency of the library screening, it was estimated that human nuclear DNA contains at least several hundred copies of mtDNA-like fragments. Four clones carrying nuclear DNA sequences homologous to the mitochondrial Unidentified Reading Frame (URF) 4 and URF5 regions were chosen for further studies, and their structures were analyzed by DNA sequencing. Comparison of these mtDNA-like sequences with that of mtDNAs of several mammalian species revealed conservation of a part of the structures present in direct ancestral mtDNAs. The mtDNA fragments seem to have been continuously integrated into mammalian nuclear DNA during evolution.
Article
Substitution rates in pseudogenes can be used to estimate the frequencies of different types of mutation on the assumption that pseudogenes are not subject to selective constraints. These rates are used here to investigate the effect of neighboring bases on mutation rates. There is a marked increase in the frequency of transitions, though not of transversions, from the doublet CG. There are also some smaller effects of neighboring bases on the frequencies of transitions from adenine and thymine. The results are used to predict dinucleotide frequencies in a stretch of DNA subject to no selective constraints and to investigate the possibility of non-randomness in the usage of stop codons.
Article
On the neutral mutation hypothesis, the rate of nucleotide substitution is expected to be higher for functionally less important genes or parts of genes than for functionally more important genes, as the latter would be subject to stronger purifying (negative) selection. On the other hand, selectionists believe that most nucleotide substitutions are caused by positive darwinian selection, in which case the rate of nucleotide substitution in functionally unimportant genes or parts of genes is expected to be relatively lower because the mutations in these regions of DNA would not produce any significant selective advantages. Kimura and Jukes have argued that the higher substitution rate observed at the third positions of codons than at the first two positions supports the neutral mutation hypothesis, as most third-position substitutions are synonymous and do not change the amino acids encoded, although others have discussed the possibility that third-position substitutions are subject to positive darwinian selection. Recently, Kimura noted that the mouse globin pseudogene, psi alpha 3, evolved faster than the normal mouse alpha 1 gene, although he did not compute the substitution rate. Here, we present a method of computing the rate of nucleotide substitution for pseudogenes, and report that the three recently discovered pseudogenes show an extremely high rate of nucleotide substitution. As these pseudogenes apparently have no function, this finding strongly supports the neutral mutation hypothesis.
Article
Mammalian mitochondrial DNA sequences evolve more rapidly than nuclear sequences. Although the rapid rate of evolution is an advantage for the study of closely related species and populations, it presents a problem in situations where related species, used as outgroups in phylogenetic analyses, have accumulated so much change that multiple substitutions obliterate the phylogenetic information. However, mitochondrial DNA sequences are frequently inserted into the nuclear genome, where they presumably evolve as nuclear pseudogene sequences and therefore more slowly than their mitochondrial counterparts. Such sequences thus represent molecular 'fossils' that could shed light on the evolution of the mitochondrial genome and could be used as outgroups in situations where no appropriate outgroup species exist. Here we show that human chromosome 11 carries a recent integration of the mitochondrial control region that can be used to gain further insight into the origin of the human mitochondrial gene pool.
Article
Using oligonucleotide primers designed to match conserved regions of mammalian mitochondrial DNA (mtDNA), we have amplified and sequenced two divergent cytochrome b nuclear pseudogenes from orangutan cellular DNA. Evolutionary analysis suggests that a nuclear transfer occurred about 30 million years ago on the lineage leading to the catarrhines (Old World monkeys and hominoids), and involved a long (at least 3 kilobases), probably damaged, piece of mtDNA. After this transfer, the pseudogene duplicated, giving rise to the two copies that are probably present in all hominoids, including humans. More recent transfers involving the entire cytochrome b gene have also occurred in the Old World monkeys. Such nuclear copies of mtDNA can confound phylogenetic and population genetic studies, and be an insidious source of DNA contamination of 'ancient' and forensic DNA. Indeed, contamination with these anciently transferred human pseudogenes is almost certainly the source of the cytochrome b sequences recently reported from 'dinosaur bone DNA'.
Article
Nuclear copies of mitochondrial genes have been reported several times. Presented here is a direct comparison of a fragment of the mitochondrial gene coding for Cytochrome b and its assumed nuclear pseudogene in a phylogenetic context. By studying eight such sets of genes, a direct measurement of the relative rates of several types of substitutions was made. As expected, mitochondrial third position transitions are the fastest accumulating substitutions, here indicated to be at least up to 39 times faster than corresponding positions in the supposed nuclear pseudogene. Translocated mitochondrial genes, evolving much slower than their functional 'counterpart', reflect the ancestral, pre-translocated form of the gene. A warning is given against unwanted inclusion of paralogous sequences in phylogenetic analysis and against the use of versatile primers that can promote such incidents.
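One common first check for unwanted paralogous sequences of the kind warned about above is to screen a putative mitochondrial coding fragment for frame-disrupting length differences and in-frame stop codons. The sketch below is an illustration, not a method taken from any of the studies listed here. It assumes a vertebrate sequence (stop codons TAA, TAG, AGA and AGG in the vertebrate mitochondrial genetic code), assumes the fragment starts at the first position of a codon and lies inside the gene, and uses hypothetical names throughout. A clean result does not prove mitochondrial origin, since recently transferred copies may carry no such lesions.

```python
# Stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2).
VERTEBRATE_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def screen_coding_fragment(seq: str, reference_length: int) -> list[str]:
    """Return reasons to suspect that an in-frame coding fragment is a nuclear copy.

    seq              -- amplified sequence, assumed to start at codon position 1
    reference_length -- length of the homologous region in a trusted mtDNA reference
    """
    seq = seq.upper()
    reasons = []

    # A length difference that is not a multiple of three implies a frameshift.
    if (len(seq) - reference_length) % 3 != 0:
        reasons.append("frameshifting length difference")

    # Walk complete codons and report the first in-frame stop, if any.
    for start in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[start:start + 3]
        if codon in VERTEBRATE_MT_STOPS:
            reasons.append(f"in-frame stop {codon} at codon {start // 3 + 1}")
            break

    return reasons


# TGA encodes tryptophan in vertebrate mitochondria, so this fragment passes.
print(screen_coding_fragment("ATGGCATGA", 9))
# AGA is a stop codon in the vertebrate mitochondrial code, so this one is flagged.
print(screen_coding_fragment("ATGAGAGCA", 9))
```

For invertebrate taxa the stop-codon set would need to be replaced with the one from the appropriate genetic code.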
Article
Monkey mummy bones and teeth originating from the North Saqqara Baboon Galleries (Egypt), soft tissue from a mummified baboon in a museum collection, and nineteenth/twentieth-century skin fragments from mangabeys were used for DNA extraction and PCR amplification of part of the mitochondrial 12S rRNA gene. Sequences aligning with the 12S rRNA gene were recovered but were only distantly related to contemporary monkey mitochondrial 12S rRNA sequences. However, many of these sequences were identical or closely related to human nuclear DNA sequences resembling mitochondrial 12S rRNA (isolated from a cell line depleted in mitochondria) and therefore have to be considered contamination. Subsequently in a separate study we were able to recover genuine mitochondrial 12S rRNA sequences from many extant species of nonhuman Old World primates and sequences closely resembling the human nuclear integrations. Analysis of all sequences by the neighbor-joining (NJ) method indicated that mitochondrial DNA sequences and their nuclear counterparts can be divided into two distinct clusters. One cluster contained all contemporary cytoplasmic mitochondrial DNA sequences and approximately half of the monkey nuclear mitochondrial-like sequences. A second cluster contained most human nuclear sequences and the other half of monkey nuclear sequences with a separate branch leading to human and gorilla mitochondrial and nuclear sequences. Sequences recovered from ancient materials were equally divided between the two clusters. These results constitute a warning when working with ancient DNA or performing phylogenetic analysis using mitochondrial DNA as a target sequence: Nuclear counterparts of mitochondrial genes may lead to faulty interpretation of results.
Article
In the course of studies on mutations in human mitochondrial (mt) DNA, we have uncovered and sequenced four new nuclear pseudogenes corresponding to bp 2457-2657 of the mt 16S rDNA. The four genes and their homologies with human mtDNA are E2 (62.4%), K10 (74.4%), E1 (84.6%) and LE6 (93.2%). When these five pseudogene sequences and another previously reported pseudogene sequence are compared with each other, they display what appears to be an ordered series of steps from a hypothetical common ancestor. The sequence of the hypothetical ancestor closely resembles that found in a wide variety of present-day mammalian mt genomes. The pseudogene sequences suggest an evolutionary trail of mt mutation dominated by base pair transitions punctuated by integration into the nuclear genome. Once integrated into the nuclear genome, the pseudogenes appear to follow the distinctive nuclear mutational pathway in which GC to AT transitions predominate and CpG sequences are preferentially eliminated.
Article
The nuclear genes of Drosophila evolve at various rates. This variation seems to correlate with codon-usage bias. In order to elucidate the determining factors of the various evolutionary rates and codon-usage bias in the Drosophila nuclear genome, we compared patterns of codon-usage bias with base compositions of exons and introns. Our results clearly show the existence of selective constraints at the translational level for synonymous (silent) sites and, on the other hand, the neutrality or near neutrality of long stretches of nucleotide sequence within noncoding regions. These features were found for comparisons among nuclear genes in a particular species (Drosophila melanogaster, Drosophila pseudoobscura and Drosophila virilis) as well as in a particular gene (alcohol dehydrogenase) among different species in the genus Drosophila. The patterns of evolution of synonymous sites in Drosophila are more similar to those in the prokaryotes than they are to those in mammals. If a difference in the level of expression of each gene is a main reason for the difference in the degree of selective constraint, the evolution of synonymous sites of Drosophila genes would be sensitive to the level of expression among genes and would change as the level of expression becomes altered in different species. Our analysis verifies these predictions and also identifies additional selective constraints at the translational level in Drosophila.
Article
Four nuclear pseudogenes homologous to the 10031-10195-bp region of the human mitochondrial genome were detected by constant denaturant capillary electrophoresis. Among them, one pseudogene is present as at least five copies in each cell, in accordance with our previous observations of multi-copy mitochondrial DNA pseudogenes. The presence of multiple identical copies of pseudogenes suggests that the human genome underwent a series of genetic changes, including gene amplifications, very recently in evolutionary history, i.e., within the last 390000 years.
Article
Simple sequences present in long (> 30 kb) sequences representative of the single-copy genome of five species (Homo sapiens, Caenorhabditis elegans, Saccharomyces cerevisiae, E. coli, and Mycobacterium leprae) have been analyzed. A close relationship was observed between genome size and the overall level of sequence repetition. This suggested that the incorporation of simple sequences had accompanied increases of genome size during evolution. Densities of simple sequence motifs were higher in noncoding regions than in coding regions in eukaryotes but not in eubacteria. All five genomes showed strongly biased frequency distributions of simple sequence motifs, particularly the eukaryotes, where AAA and TTT predominated. Interspecific comparisons showed that noncoding sequences in eukaryotes had highly significantly similar frequency distributions of simple sequence motifs, but this was not true of coding sequences. ANOVA of the frequency distributions of simple sequence motifs indicated strong contributions from motif base composition and repeat unit length, but much of the variation remained unexplained by these parameters. The sequence composition of simple sequences therefore appears to reflect both underlying sequence biases in slippage-like processes and the action of selection. Frequency distributions of simple sequence motifs in coding sequences correlated weakly or not at all with those in noncoding sequences. Selection on coding sequences to eliminate undesirable sequences may therefore have been strong, particularly in the human lineage.
Article
The nuclear genomes of many animals contain non-functional copies of mitochondrial genes that provide new opportunities for evolutionary analysis.
Article
Animal mitochondrial DNA has proved a valuable marker in intraspecific systematic studies. However, if nucleotide sequence heterogeneity exists at the individual level, its usefulness will be much reduced. The study demonstrates that the presence of highly conserved non-coding mitochondrial sequences in the nuclear genome of Schistocerca gregaria greatly impairs the use of mtDNA in population genetic studies. Caution is called for in other organisms; and it seems necessary to check for conserved nuclear copies of mitochondrial sequences before launching into a large scale analysis of populations using mtDNA as a genetic marker. Experimental procedures are suggested for this purpose.
Article
The escape and migration of genetic information between mitochondria, chloroplasts, and nuclei have been an integral part of evolution and have a continuing impact on the biology of cells. The evolutionary transfer of functional genes and fragments of genes from chloroplasts to mitochondria, from chloroplasts to nuclei, and from mitochondria to nuclei has been documented for numerous organisms. Most documented instances of genetic material transfer have involved the transfer of information from mitochondria and chloroplasts to the nucleus. The pathways for the escape of DNA from organelles may include transient breaches in organellar membranes during fusion and/or budding processes, terminal degradation of organelles by autophagy coupled with the subsequent release of nucleic acids to the cytoplasm, illicit use of nucleic acid or protein import machinery, or fusion between heterotypic membranes. Some or all of these pathways may lead to the escape of DNA or RNA from organellar compartments with subsequent uptake of nucleic acids from the cytoplasm into the nucleus. Investigations into the escape of DNA from mitochondria in yeast have shown the rate of escape for gene-sized fragments of DNA from mitochondria and their subsequent migration to the nucleus to be roughly equivalent to the rate of spontaneous mutation of nuclear genes. Smaller fragments of mitochondrial DNA may appear in the nucleus even more frequently. Mutations of nuclear genes that define gene products important in controlling the rate of DNA escape from mitochondria in yeast also have been described. The escape of genetic material from mitochondria and chloroplasts has clearly had an impact on nuclear genetic organization throughout evolution and may also affect cellular metabolic processes.
Article
The frequency of a polymorphic mitochondrial DNA insertion into the nuclear genome was determined for 870 individuals from a geographically diverse set of 20 populations. The mtDNA insertion frequency varies significantly among populations, having a large GST value (0.178) and high heterozygosity values within populations. The clinal pattern of increasing frequency of the insertion from Africans through Europeans and Asians to native Americans is striking. The polymorphism is a new example of insertion-deletion polymorphisms and is a valuable marker for human population and evolutionary studies.
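The GST statistic quoted above measures how much of the total gene diversity lies between populations, GST = (HT - HS) / HT, where HS is the mean within-population expected heterozygosity and HT is the expected heterozygosity of the pooled population. A minimal sketch for a two-allele locus such as insertion presence/absence follows. It assumes equal weighting of populations and uses hypothetical frequencies; the abstract does not state which estimator or weighting the authors used, and the function name is illustrative.

```python
def gst_biallelic(insertion_freqs):
    """Nei's GST for one biallelic locus from per-population allele frequencies.

    Assumption: populations are weighted equally (unweighted means), with no
    sample-size correction.
    """
    if not insertion_freqs:
        raise ValueError("need at least one population")
    if any(p < 0.0 or p > 1.0 for p in insertion_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n = len(insertion_freqs)
    # Mean within-population expected heterozygosity, 2p(1-p) per population.
    hs = sum(2.0 * p * (1.0 - p) for p in insertion_freqs) / n
    # Expected heterozygosity of the pooled population.
    p_bar = sum(insertion_freqs) / n
    ht = 2.0 * p_bar * (1.0 - p_bar)
    if ht == 0.0:
        return 0.0  # locus is fixed everywhere; no diversity to partition
    return (ht - hs) / ht


# Hypothetical frequencies for three populations (not data from the study).
example_gst = gst_biallelic([0.1, 0.5, 0.9])
```

Identical frequencies across populations give a GST of zero, while strongly diverging frequencies, as in the hypothetical example, push the value toward one.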
Article
The mitochondrial genome of 23 Arabidopsis thaliana ecotypes was analysed by Southern hybridization in total cellular DNA. Two questions were investigated: first, the extent of divergence between the mitochondrial genomes in closely related lines of one plant species, and second, the use of mitochondrial versus nuclear RFLPs to determine evolutionary relationships between Arabidopsis ecotype isolates. Highly divergent stoichiometries of alternative mitochondrial genome arrangements characterize individual ecotypes, including the complete loss of a 5 kb region from ecotype Landsberg without apparent effect on plant viability. The genetic similarities between ecotypes suggested by mitochondrial genome arrangements differ from those deduced from 18 nuclear RFLP loci (CAPS markers). Similarity of nuclear RFLP patterns among the 23 Arabidopsis ecotypes correlates neither with their geographic origin nor with the observed mitochondrial genome arrangements. A promiscuous mitochondrial sequence insertion previously identified in ecotype Columbia is also found in the nuclear genomes of ecotypes Eifel, Enkheim and Hilversum. Two ecotypes (Eifel and Tabor) displaying identical RFLP patterns at all 18 nuclear loci show differences in both this sequence transfer and a mitochondrial DNA recombination event.
Article
The nuclear DNA of normal and tumor mouse and rat tissue was examined for mitochondrial-DNA-like inserts by means of the Southern blot technique. The two probes were 32P-labeled cloned mitochondrial DNA. KpnI, which does not cut either mitochondrial DNA, was one of the restriction enzymes, while the enzymes that fragment mitochondrial DNA were PstI for mouse and BamHI for rat. When KpnI alone was used in the procedure, a nuclear LINE family whose elements had mitochondrial-DNA-like insertions was selected. Such elements were much more abundant in tumor than in normal tissue. The results with PstI alone and BamHI alone, and each combined with KpnI, indicated that there were mobile LINE elements with mitochondrial-DNA-like inserts in the nuclear genome of tumor. The mouse tissues were normal liver and a transplantable lymphoid leukemic ascites cell line L1210 that had been carried for 40 years. The rat tissues were normal liver and a hepatoma freshly induced by diethylnitrosamine in order to minimize the role of 40 years of transplantation. Our unitary hypothesis for carcinogenesis of 1971, which suggested these experiments, has been augmented to include mobile nuclear elements with inserts of mitochondrial-DNA-like sequences. Such elements have been related to diseases of genetic predisposition such as breast cancer and Huntington's disease.
Article
Recent phylogenetic analyses reveal that many eukaryotic nuclear genes whose prokaryotic ancestry can be pinned down are of bacterial origin. Among them are genes whose products function exclusively in cytosolic metabolism. The results are surprising: we had come to believe that the eukaryotic nuclear genome shares a most recent common ancestor with archaeal genomes, thus most of its genes should be 'archaeal' (loosely speaking). Some genes of bacterial origin were expected as the result of transfer from mitochondria, of course, but these were thought to be relatively few, and limited to producing proteins reimported into mitochondria. Here, I suggest that the presence of many bacterial genes with many kinds of functions should not be a surprise. The operation of a gene transfer ratchet would inevitably result in the replacement of nuclear genes of early eukaryotes by genes from the bacteria taken by them as food.
Article
Hair has become a widely used source of DNA in population genetics, forensics, and conservation biology. Here we report that PCR primers that amplify a segment of the mitochondrial control region from blood DNA amplify primarily integrated nuclear copies of mitochondrial DNA from hair DNA. Thus, in some species, and under some circumstances, DNA from hair may yield unreliable results.
Article
A full-length cytochrome b pseudogene was found in rodents; it has apparently been translocated from a mitochondrion to the nuclear genome in the subfamily Arvicolinae. The pseudogene (psi cytb) differed from its mitochondrial counterpart at 201 of 1143 sites (17.6%) and by four indels. Cumulative evidence suggests that the pseudogene has been translocated to the nucleus. Phylogenetic reconstruction indicates that the pseudogene arose before the diversification of M. arvalis/M. rossiaemeridionalis from M. oeconomus, but after the divergence of the peromyscine/sigmodontine/arvicoline clades approximately 10 MYA. Published rates of divergence between mitochondrial genes and their nuclear pseudogenes suggest that the translocation of this mitochondrial gene to the nuclear genome occurred some 6 MYA, in agreement with the phylogenetic evidence.
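The 17.6% figure above is an uncorrected proportion of differing sites (a p-distance), and the dating argument divides such a distance by a divergence rate. A minimal sketch of that arithmetic follows. The function names are hypothetical, the rate is left as a caller-supplied input because the abstract does not state which published rate was used, and no correction for multiple substitutions at the same site is applied.

```python
def p_distance(n_differences, n_sites):
    """Uncorrected proportion of sites that differ between two sequences."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    if not 0 <= n_differences <= n_sites:
        raise ValueError("n_differences must lie between 0 and n_sites")
    return n_differences / n_sites


def divergence_time_my(distance, rate_per_my):
    """Time in millions of years implied by a distance and a divergence rate.

    rate_per_my is the pairwise divergence per site per million years.
    It must come from the literature; none is assumed here.
    """
    if rate_per_my <= 0:
        raise ValueError("rate_per_my must be positive")
    return distance / rate_per_my


# Substitution differences reported in the abstract (indels are not counted).
cytb_distance = p_distance(201, 1143)
cytb_percent = round(100 * cytb_distance, 1)  # 17.6
```

The four indels are reported separately in the abstract and do not enter this proportion.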
Article
Constant denaturant capillary electrophoresis (CDCE) permits high-resolution separation of single-base variations occurring in an approximately 100 bp isomelting DNA sequence based on their differential melting temperatures. By coupling CDCE for highly efficient enrichment of mutants with high-fidelity polymerase chain reaction (hifi PCR), we have developed an analytical approach to detecting point mutations at frequencies equal to or greater than 10⁻⁶ in human genomic DNA. In this article, we present several applications of this approach in human genetic studies. We have measured the point mutational spectra of a 100 bp mitochondrial DNA sequence in human tissues and cultured cells. The observations have led to the conclusion that the primary causes of mutation in human mitochondrial DNA are spontaneous in origin. In the course of studying the mitochondrial somatic mutations, we have also identified several nuclear pseudogenes homologous to the analyzed mitochondrial DNA fragment. Recently, through developments of the means to isolate the desired target sequences from bulk genomic DNA and to increase the loading capacity of CDCE, we have extended the CDCE/hifi PCR approach to study a chemically induced mutational spectrum in a single-copy nuclear sequence. Future applications of the CDCE/hifi PCR approach to human genetic analysis include studies of somatic mitochondrial mutations with respect to aging, measurement of mutational spectra of nuclear genes in healthy human tissues and population screening for disease-associated single nucleotide polymorphisms (SNPs) in large pooled samples.
Article
The transfer of organelle nucleic acid to the nucleus has been observed in both plants and animals. Using a unique assay to monitor mitochondrial DNA escape to the nucleus in the yeast Saccharomyces cerevisiae, we previously showed that mutations in several nuclear genes, collectively called yme mutants, cause a high rate of mitochondrial DNA escape to the nucleus. Here we demonstrate that mtDNA escape occurs via an intracellular mechanism that is dependent on the composition of the growth medium and the genetic state of the mitochondrial genome, and is independent of an RNA intermediate. Isolation of several unique second-site suppressors of the high rate of mitochondrial DNA-escape phenotype of yme mutants suggests that there are multiple independent pathways by which this nucleic acid transfer occurs. We also demonstrate that the presence of centromeric plasmids in the nucleus can reduce the perceived rate of DNA escape from the mitochondria. We propose that mitochondrial DNA-escape events are manifested as unstable nuclear plasmids that can interact with centromeric plasmids resulting in a decrease in the number of observed events.
Article
The endosymbiotic theory for the origin of eukaryotic cells proposes that genetic information can be transferred from mitochondria to the nucleus of a cell, and genes that are probably of mitochondrial origin have been found in nuclear chromosomes. Occasionally, short or rearranged sequences homologous to mitochondrial DNA are seen in the chromosomes of different organisms including yeast, plants and humans. Here we report a mechanism by which fragments of mitochondrial DNA, in single or tandem array, are transferred to yeast chromosomes under natural conditions during the repair of double-strand breaks in haploid mitotic cells. These repair insertions originate from noncontiguous regions of the mitochondrial genome. Our analysis of the Saccharomyces cerevisiae mitochondrial genome indicates that the yeast nuclear genome does indeed contain several short sequences of mitochondrial origin which are similar in size and composition to those that repair double-strand breaks. These sequences are located predominantly in non-coding regions of the chromosomes, frequently in the vicinity of retrotransposon long terminal repeats, and appear as recent integration events. Thus, colonization of the yeast genome by mitochondrial DNA is an ongoing process.