GARD: a genetic algorithm for recombination detection.

Department of Pathology, University of California San Diego, La Jolla, CA 92093, USA.
Bioinformatics (Impact Factor: 4.62). 01/2007; 22(24):3096-8. DOI: 10.1093/bioinformatics/btl474
Source: PubMed

ABSTRACT Phylogenetic and evolutionary inference can be severely misled if recombination is not accounted for; screening for it should therefore be an essential component of nearly every comparative study. The evolution of recombinant sequences cannot be properly explained by a single phylogenetic tree, but several phylogenies may be used to correctly model the evolution of non-recombinant fragments.
We developed a likelihood-based model selection procedure that uses a genetic algorithm to search multiple sequence alignments for evidence of recombination breakpoints and identify putative recombinant sequences. GARD is an extensible and intuitive method that can be run efficiently in parallel. Extensive simulation studies show that the method nearly always outperforms other available tools in terms of both power and accuracy, and that the use of GARD to screen sequences for recombination ensures good statistical properties for methods aimed at detecting positive selection.
Freely available
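
The abstract describes the search only in outline, so the following is a toy sketch of the general idea, not GARD's implementation. It assumes a hypothetical per-site 0/1 signal in place of a sequence alignment, a Bernoulli likelihood per segment in place of a phylogenetic likelihood, and AIC as the model selection score; the function names and parameters are all illustrative.

```python
import math
import random


def segment_loglik(bits):
    """Bernoulli log-likelihood of one segment at its maximum-likelihood rate."""
    n, k = len(bits), sum(bits)
    if k == 0 or k == n:
        return 0.0  # a constant segment is fit perfectly
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def aic(signal, breakpoints):
    """AIC of a piecewise model (assumed score): one rate per segment, one parameter per breakpoint."""
    cuts = [0] + sorted(breakpoints) + [len(signal)]
    loglik = sum(segment_loglik(signal[a:b]) for a, b in zip(cuts, cuts[1:]))
    n_params = (len(cuts) - 1) + len(breakpoints)
    return 2 * n_params - 2 * loglik


def ga_search(signal, n_breakpoints, pop_size=30, generations=60, seed=0):
    """Toy genetic algorithm: evolve breakpoint placements that minimise AIC."""
    rng = random.Random(seed)
    last = len(signal) - 1

    def random_individual():
        return tuple(sorted(rng.sample(range(1, len(signal)), n_breakpoints)))

    def crossover(a, b):
        # Take each breakpoint from one of the two parents.
        child = sorted(rng.choice(pair) for pair in zip(a, b))
        return tuple(child) if len(set(child)) == len(child) else a

    def mutate(ind):
        # Shift one breakpoint by a few sites, staying inside the alignment.
        child = list(ind)
        i = rng.randrange(len(child))
        child[i] = min(max(child[i] + rng.randint(-5, 5), 1), last)
        return tuple(sorted(child)) if len(set(child)) == len(child) else ind

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: aic(signal, ind))
        elite = population[: pop_size // 2]  # keep the better half unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return min(population, key=lambda ind: aic(signal, ind))


# Hypothetical data: the signal changes state at site 60.
signal = [0] * 60 + [1] * 40
best = ga_search(signal, n_breakpoints=1)
print("best breakpoints:", best)
print("AIC with breakpoint:", aic(signal, best), "without:", aic(signal, []))
```

Comparing the score of the best breakpoint model against the no-breakpoint model is the model selection step: a breakpoint is only supported if it lowers the penalised score.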

  • Source
    ABSTRACT: Inference of evolutionary relationships among closely related darter species (Teleostei: Percidae) has traditionally proven challenging due to a lack of sufficient numbers of informative morphological characters or reliance on mtDNA sequences. These factors have contributed to longstanding uncertainty of the monophyly of many described taxonomic groups. Although multi-locus data are now available for most darter species, uncertainty has persisted regarding the relationships of some major lineages. Here, we investigate the relationships of darters classified in Goneaperca, a clade of 46 species, many of which are characterized by distinct nuptial displays and male-only parental care. Previous phylogenetic analyses of morphological and molecular data have failed to provide strong resolution of relationships among major Goneaperca subclades, and especially the monophyly of Catonotus. We apply coalescent and phylogenetic analyses to a dataset that includes intraspecific sampling for nearly all species of Goneaperca for 13 nuclear genes. Our coalescent species tree analyses resolved a strongly supported sister relationship between Boleosoma and a monophyletic Catonotus. Ancestor state reconstructions using the posterior distribution of these newly inferred phylogenies support a single origin of male-only parental care in the most recent common ancestor of Boleosoma and Catonotus.
    Molecular Phylogenetics and Evolution 01/2015; 84:158-165. DOI:10.1016/j.ympev.2015.01.002 · 4.02 Impact Factor
  • Source
    ABSTRACT: Hepatitis C virus (HCV) infection is characterized by persistent replication of a complex mixture of viruses termed a "quasispecies." Transmission is generally associated with a stringent population bottleneck characterized by infection by limited numbers of "transmitted/founder" (T/F) viruses. Characterization of T/F genomes of human immunodeficiency virus type 1 (HIV-1) has been integral to studies of transmission, immunopathogenesis, and vaccine development. Here, we describe the identification of complete T/F genomes of HCV by single-genome sequencing of plasma viral RNA from acutely infected subjects. A total of 2,739 single-genome-derived amplicons comprising 10,966,507 bp from 18 acute-phase and 11 chronically infected subjects were analyzed. Acute-phase sequences diversified essentially randomly, except for the poly(U/UC) tract, which was subject to polymerase slippage. Fourteen acute-phase subjects were productively infected by more than one genetically distinct virus, permitting assessment of recombination between replicating genomes. No evidence of recombination was found among 1,589 sequences analyzed. Envelope sequences of T/F genomes lacked transmission signatures that could distinguish them from chronic infection viruses. Among chronically infected subjects, higher nucleotide substitution rates were observed in the poly(U/UC) tract than in envelope hypervariable region 1. Fourteen full-length molecular clones with variable poly(U/UC) sequences corresponding to seven genotype 1a, 1b, 3a, and 4a T/F viruses were generated. Like most unadapted HCV clones, T/F genomes did not replicate efficiently in Huh 7.5 cells, indicating that additional cellular factors or viral adaptations are necessary for in vitro replication. Full-length T/F HCV genomes and their progeny provide unique insights into virus transmission, virus evolution, and virus-host interactions associated with immunopathogenesis. 
Hepatitis C virus (HCV) infects 2% to 3% of the world's population and exhibits extraordinary genetic diversity. This diversity is mirrored by HIV-1, where characterization of transmitted/founder (T/F) genomes has been instrumental in studies of virus transmission, immunopathogenesis, and vaccine development. Here, we show that despite major differences in genome organization, replication strategy, and natural history, HCV (like HIV-1) diversifies essentially randomly early in infection, and as a consequence, sequences of actual T/F viruses can be identified. This allowed us to capture by molecular cloning the full-length HCV genomes that are responsible for infecting the first hepatocytes and eliciting the initial immune responses, weeks before these events could be directly analyzed in human subjects. These findings represent an enabling experimental strategy, not only for HCV and HIV-1 research, but also for other RNA viruses of medical importance, including West Nile, chikungunya, dengue, Venezuelan encephalitis, and Ebola viruses. Copyright © 2015 Stoddard et al.
    mBio 01/2015; 6(2). DOI:10.1128/mBio.02518-14 · 6.88 Impact Factor
  • ABSTRACT: The evolution of citrus tristeza virus (CTV) from outbreaks that occurred in Calabria, Italy, was compared with that of CTV outbreaks reported previously in two other nearby Italian regions, Sicily and Apulia. Examination of four genomic regions (genes p20, p25 and p23, and one fragment of open reading frame 1) showed two recombination events, and phylogenetic analysis disclosed two divergent CTV groups in Calabria: one formed by severe isolates and the other by mild isolates. This analysis, together with others involving population genetic parameters, revealed a low migration rate of CTV between the three Italian regions, as well as significant differences in selective pressures, epidemiology and demography, all affecting the genetic structure of CTV populations.
    European Journal of Plant Pathology 11/2014; 140(3):607-613. DOI:10.1007/s10658-014-0489-3 · 1.71 Impact Factor
