HAPLOFIND: a new method for high-throughput mtDNA haplogroup assignment

Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40126, Bologna, Italy
Human Mutation (Impact Factor: 5.14). 09/2013; DOI: 10.1002/humu.22356
Source: PubMed


Deep sequencing technologies are revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies have entered the "post-genomic era": the burst in sequenced samples observed in nuclear genomics is also expected for mitochondria, a trend that can already be detected in the submission rate of complete mtDNA sequences to databases. Tools for the analysis of these data are available, but they fall short in throughput or in ease of use. We present here a new pipeline based on existing algorithms, inherited from the "nuclear genomic toolbox", combined with a newly developed algorithm capable of efficiently and easily classifying new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. By analyzing all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree that takes into account the conservation of each haplogroup pattern. The combination of a highly efficient aligner, our algorithm, and extensive use of asynchronous parallel processing allowed us to build a high-throughput pipeline for the analysis of mitochondrial DNA sequences that can quickly be updated to follow the ever-changing nomenclature. HaploFind is freely accessible on the web.

  • Source
    • "With these final settings we obtain an overall call rate of 0.9999, with the call rate per sample ranging from 0.999 to 1. Finally, haplogroups were inferred with HaploFind (Vianello et al. 2013), which uses haplogroup definitions reported by Phylotree (van Oven and Kayser 2009)."

  • Source
    • "By evaluating haplogroup assignments, an analyst may identify potential errors a posteriori. These assignments may be done manually or using software applications [27] [28] [29] with debatable success [26]. Software applications, such as HaploGrep, have proven to be successful and allow haplogroup generation for thousands of samples at a time, making them ideal for analysis of large population data sets."
    ABSTRACT: The mitochondrial genome (mtGenome) contains genetic information amenable to numerous applications such as medical research, population and evolutionary studies, and human identity testing. However, inconsistent nomenclature assignment makes haplotype comparison difficult and can lead to false exclusion of potentially useful profiles. Massively Parallel Sequencing (MPS) is a platform for sequencing large datasets and potentially whole populations with relative ease. However, the data generated are not easily parsed and interpreted. With this in mind, mitoSAVE has been developed to enable fast conversion of Variant Call Format (VCF) files. mitoSAVE is an Excel-based workbook that converts data within the VCF into mtDNA haplotypes using phylogenetically established nomenclature as well as rule-based alignments consistent with current forensic standards. mitoSAVE is formatted for the human mitochondrial genome; however, it can easily be adapted to support other reasonably small genomes.
    Forensic Science International: Genetics 06/2014; 12C:122-125. DOI:10.1016/j.fsigen.2014.05.013 · 4.60 Impact Factor
  • Source
    • "The study of epistatic interactions represents a challenge [126] that is gradually becoming more achievable with the availability of NGS techniques. Indeed, the significant drop in costs and time commitment required to obtain a complete mtDNA sequence caused a burst in the number of available samples in both private and public databases [127]. It is noteworthy that about 22% of the whole database of the complete human mtDNA sequences was deposited in the last 12 months. "
    ABSTRACT: The genetics of human longevity is usually restricted to the nuclear genome (nDNA). However, it is well known that the nDNA interacts with a physically and functionally separate genome, the mitochondrial DNA (mtDNA), which, although limited in length and in the number of genes it encodes, plays a major role in the ageing process. The complex interplay between nDNA/mtDNA and the environment is most likely involved in phenomena such as ageing and longevity. To this scenario we have to add another level of complexity represented by the microbiota, that is, the whole set of bacteria present in the different parts of our body together with their whole set of genes. In particular, several studies have investigated the role of gut microbiota (GM) modifications in ageing and longevity, and an age-related GM signature has been found. In this view, the human being must be considered a "metaorganism", and a more holistic approach is necessary to grasp the complex dynamics of the interaction between the environment and the nDNA-mtDNA-GM of the host during ageing. In this review, the relationship between the three genetics and human longevity is addressed to point out that a comprehensive view will allow researchers to properly address the complex interactions that occur during the human lifespan.
    BioMed Research International 04/2014; 2014:560340. DOI:10.1155/2014/560340 · 2.71 Impact Factor