Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D04103, Leipzig, Germany.
Nucleic Acids Research (Impact Factor: 8.81). 05/2012; 40(18):e137. DOI: 10.1093/nar/gks499
Source: PubMed

ABSTRACT Enriching target sequences in sequencing libraries via capture hybridization to bait/probes is an efficient means of leveraging the capabilities of next-generation sequencing for obtaining sequence data from target regions of interest. However, homologous sequences from non-target regions may also be enriched by such methods. Here we investigate the fidelity of capture enrichment for complete mitochondrial DNA (mtDNA) genome sequencing by analyzing sequence data for nuclear copies of mtDNA (NUMTs). Using capture-enriched sequencing data from a mitochondria-free cell line and the parental cell line, and from samples previously sequenced from long-range PCR products, we demonstrate that NUMT alleles are indeed present in capture-enriched sequence data, but at low enough levels to not influence calling the authentic mtDNA genome sequence. However, distinguishing NUMT alleles from true low-level mutations (e.g. heteroplasmy) is more challenging. We develop here a computational method to distinguish NUMT alleles from heteroplasmies, using sequence data from artificial mixtures to optimize the method.
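The distinction drawn in the abstract between NUMT alleles and true low-level heteroplasmy can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It only shows one plausible first step: compute the minor allele frequency at a site from read counts, then flag the site as a possible NUMT if the minor allele matches a known NUMT allele at that position. The function names, the 2% frequency threshold, and the `numt_alleles` lookup table are all hypothetical.

```python
from collections import Counter


def minor_allele(counts):
    """Return (allele, frequency) for the second most common base, or None.

    `counts` maps a base to its read count at one mtDNA position.
    """
    total = sum(counts.values())
    ranked = Counter(counts).most_common()
    if total == 0 or len(ranked) < 2 or ranked[1][1] == 0:
        return None
    allele, n = ranked[1]
    return allele, n / total


def classify_site(pos, counts, numt_alleles, min_freq=0.02):
    """Label one site; min_freq is an assumed threshold, not a published value.

    `numt_alleles` is a hypothetical dict mapping position -> base carried
    by a known nuclear copy (NUMT) at that position.
    """
    minor = minor_allele(counts)
    if minor is None or minor[1] < min_freq:
        return "no_minor_allele"
    allele, _freq = minor
    if numt_alleles.get(pos) == allele:
        return "possible_numt"
    return "candidate_heteroplasmy"


if __name__ == "__main__":
    known = {73: "G"}  # hypothetical NUMT allele table
    print(classify_site(73, {"A": 950, "G": 50}, known))
    print(classify_site(150, {"C": 950, "T": 50}, known))
```

In practice a frequency threshold of this kind would have to be tuned against data with known composition, which is the role the artificial mixtures play in the abstract.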

  • ABSTRACT: Next-generation sequencing, also known as high-throughput sequencing, has greatly enhanced researchers' ability to conduct biomedical research at all levels. Mitochondrial research has also benefitted greatly from high-throughput sequencing; sequencing technology now allows all 16,569 base pairs of the mitochondrial genome to be screened simultaneously for SNPs and low-level heteroplasmy and, in some cases, allows mitochondrial DNA copy number to be estimated. It is important to realize the full potential of high-throughput sequencing for the advancement of mitochondrial research. To this end, we review how high-throughput sequencing has impacted mitochondrial research in the categories of SNPs, low-level heteroplasmy, copy number, and structural variants. We also discuss the different types of mitochondrial DNA sequencing and their pros and cons. Based on previous studies conducted by various groups, we provide strategies for processing mitochondrial DNA sequencing data, including assembly, variant calling, and quality control.
    Mitochondrion 05/2014; · 3.52 Impact Factor
  • ABSTRACT: The last 30 years of research have contributed greatly to shedding light on the role of mitochondrial DNA (mtDNA) variability in aging, although contrasting results have been reported, mainly owing to bias in population size and stratification and to the use of analysis methods (haplogroup classification) that proved inadequate to capture the complexity of the phenomenon. A 5-year European study (the GEHA EU project) collected and analysed data on mtDNA variability in an unprecedented number of long-living subjects (enriched for longevity genes) and a comparable number of controls (matched for gender and ethnicity) in Europe. This very large study allowed a reappraisal of the role of both inherited and somatic mtDNA variability in aging, as an association with longevity emerged only when mtDNA variants in OXPHOS complexes co-occurred. Moreover, the availability of data from both nuclear and mitochondrial genomes for a large number of subjects paves the way for a very large-scale evaluation of epistatic interactions at a higher level of complexity. This picture is expected to become clearer in the near future with the use of next-generation sequencing (NGS) techniques, which are becoming applicable to the evaluation of mtDNA variability; new mathematical and bioinformatic analysis methods are therefore urgently needed. Recent advances in association studies of age-related diseases and mtDNA variability will also be discussed in this review, taking into account the bias hidden by population stratification. Finally, very recent findings on mtDNA heteroplasmy (i.e. the coexistence of wild-type and mutated copies of mtDNA) and aging, as well as mitochondrial epigenetic mechanisms, will also be discussed.
    Experimental gerontology 04/2014; · 3.34 Impact Factor
  • ABSTRACT: Nuclear mitochondrial pseudogenes (numts) are non-functional fragments of mtDNA inserted into the nuclear genome. Numts are prevalent across eukaryotes, and a positive correlation is known to exist between the number of numts and genome size. Most numt surveys have relied on model organisms with fully sequenced nuclear genomes, but such analyses have limited utility for generalizing about the patterns of numt accumulation in any given clade. Among insects, the order Orthoptera is known to have the largest nuclear genome, and it is also reported to include several species with a large number of numts. In this study, we use Orthoptera as a case study to document the diversity and abundance of numts by generating numts of three mitochondrial loci across 28 orthopteran families, representing the phylogenetic diversity of the order. We discover that numts are rampant in all lineages, but there is no discernible and consistent pattern of numt accumulation among different lineages. Likewise, we do not find any evidence that a certain mitochondrial gene is more prone to nuclear insertion than others. We also find that numt insertion must have occurred continuously and frequently throughout the diversification of Orthoptera. Although most numts are the result of recent nuclear insertion, we find evidence of very ancient numt insertion shared by highly divergent families dating back to the Jurassic period. Finally, we discuss several factors contributing to the extreme prevalence of numts in Orthoptera and highlight the importance of exploring the utility of numts in evolutionary studies.
    PLoS ONE 10/2014; 9(10):e110508. · 3.53 Impact Factor

