Article

MITOMASTER: a bioinformatics tool for the analysis of mitochondrial DNA sequences

Department of Information and Computer Science, University of California, Irvine, Irvine, California 92697-3940, USA.
Human Mutation (Impact Factor: 5.05). 01/2009; 30(1):1-6. DOI: 10.1002/humu.20801
Source: PubMed

ABSTRACT We have developed a computer system, MITOMASTER, to make analysis of human mitochondrial DNA (mtDNA) sequences efficient, accurate, and easily available. From imported sequences, the system identifies nucleotide variants, determines the haplogroup, rules out possible pseudogene contamination, identifies novel DNA sequence variants, and evaluates the potential biological significance of each variant. This system should benefit mtDNA analyses by biomedical physicians and investigators, population biologists, and forensic scientists. MITOMASTER can be accessed at http://mammag.web.uci.edu/twiki/bin/view/Mitomaster.
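
The first step named in the abstract, identifying nucleotide variants in an imported sequence, can be illustrated with a minimal sketch. This is not MITOMASTER's code or method: the function name `find_substitutions` and the toy sequences are hypothetical, and the sketch assumes the sample has already been aligned to the reference at equal length, so it reports substitutions only and ignores insertions and deletions.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumption: both sequences are pre-aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    variants = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, alt_base) in enumerate(pairs, start=1):
        if alt_base == "N":
            # Skip uncalled bases rather than reporting them as variants.
            continue
        if ref_base != alt_base:
            variants.append((pos, ref_base, alt_base))
    return variants


# Toy sequences for illustration only (not real mtDNA data).
toy_reference = "GATCACAGGT"
toy_sample = "GATTACAGNT"
print(find_substitutions(toy_reference, toy_sample))  # [(4, 'C', 'T')]
```

A real pipeline would need an alignment step first, and haplogroup assignment or pseudogene screening would build on the resulting variant list.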


Available from: Dan Mishmar, Oct 13, 2014
  • Source
    ABSTRACT: The last 30 years of research have shed considerable light on the role of mitochondrial DNA (mtDNA) variability in aging, although contrasting results have been reported, mainly because of bias in population size and stratification and because the analysis methods used (haplogroup classification) proved inadequate to capture the complexity of the phenomenon. A 5-year European study (the GEHA EU project) collected and analysed data on mtDNA variability in an unprecedented number of long-living subjects (enriched for longevity genes) and a comparable number of controls (matched for gender and ethnicity) in Europe. This very large study allowed a reappraisal of the role of both inherited and somatic mtDNA variability in aging, as an association with longevity emerged only when mtDNA variants in OXPHOS complexes co-occurred. Moreover, the availability of data from both nuclear and mitochondrial genomes for a large number of subjects paves the way for a very large-scale evaluation of epistatic interactions at a higher level of complexity. This picture is expected to become clearer in the near future with the use of next generation sequencing (NGS) techniques, which are becoming applicable to the evaluation of mtDNA variability; new mathematical and bioinformatic analysis methods are therefore urgently needed. Recent advances in association studies of age-related diseases and mtDNA variability will also be discussed in this review, taking into account the bias hidden by population stratification. Finally, very recent findings on mtDNA heteroplasmy (i.e. the coexistence of wild type and mutated copies of mtDNA) and aging, as well as on mitochondrial epigenetic mechanisms, will also be discussed.
    Experimental gerontology 04/2014; 56. DOI:10.1016/j.exger.2014.03.022 · 3.53 Impact Factor
  • Source
    ABSTRACT: MitoTool, a web-based bioinformatics platform, is designed for deciphering human mitochondrial DNA (mtDNA) data in batch mode. The platform has advantages in (i) parsing diverse types of mtDNA data; (ii) automatically classifying haplogroups according to mtDNA sequences or variants; (iii) discovering possibly missing variants in samples with a claimed haplogroup status; (iv) estimating the evolutionary conservation index, protein coding effect and potential pathogenicity of certain substitutions; (v) performing statistical analysis of haplogroup distribution frequency between case and control groups. Furthermore, it offers an integrated database for retrieving five types of mitochondrion-related information. MitoTool is freely accessible at http://www.mitotool.org.
    Mitochondrion 10/2010; 11(2):351-6. DOI:10.1016/j.mito.2010.09.013 · 3.52 Impact Factor
  • Source
    ABSTRACT: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes, such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression.
    Bioinformatics 06/2009; 25(14):1731-8. DOI:10.1093/bioinformatics/btp319 · 4.62 Impact Factor
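
The reference-based idea in the abstract above (store only the differences from a reference, locate them with relative coordinates, and write them with a fixed code) can be sketched as follows. This is a toy illustration, not the authors' implementation: it handles substitutions only, uses the Elias gamma code for the gaps between variant positions, and spends two bits per substituted base. All names (`elias_gamma`, `encode`, `decode`) and the sample variants are hypothetical.

```python
from typing import List, Tuple

Variant = Tuple[int, str]  # (1-based position, substituted base)

BASE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def elias_gamma(n: int) -> str:
    """Elias gamma code: (len - 1) zeros followed by n in binary."""
    if n < 1:
        raise ValueError("Elias gamma encodes positive integers only")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode(variants: List[Variant]) -> str:
    """Encode substitutions as gamma-coded position gaps plus 2-bit bases."""
    out = []
    previous = 0
    for pos, base in sorted(variants):
        out.append(elias_gamma(pos - previous))  # relative coordinate
        out.append(BASE_BITS[base])
        previous = pos
    return "".join(out)


def decode(bits: str) -> List[Variant]:
    """Invert encode(): read a gamma-coded gap, then a 2-bit base."""
    variants = []
    i = 0
    previous = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":
            zeros += 1
            i += 1
        gap = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        base = BITS_BASE[bits[i:i + 2]]
        i += 2
        previous += gap
        variants.append((previous, base))
    return variants


# Toy variant list for illustration only (not real data).
toy_variants = [(3, "T"), (10, "G"), (11, "A")]
encoded = encode(toy_variants)
assert decode(encoded) == toy_variants
print(len(encoded), "bits for", len(toy_variants), "variants")
```

Relative coordinates keep the encoded integers small when variants cluster, which is where a code like Elias gamma saves the most bits; a Golomb or Huffman code could be swapped in for `elias_gamma` without changing the overall structure.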