GARD: A genetic algorithm for recombination detection

Department of Pathology, University of California San Diego, La Jolla, CA 92093, USA.
Bioinformatics (Impact Factor: 4.98). 01/2007; 22(24):3096-8. DOI: 10.1093/bioinformatics/btl474
Source: PubMed


Phylogenetic and evolutionary inference can be severely misled if recombination is not accounted for; hence, screening for it should be an essential component of nearly every comparative study. The evolution of recombinant sequences cannot be properly explained by a single phylogenetic tree, but several phylogenies may be used to correctly model the evolution of non-recombinant fragments.
We developed a likelihood-based model selection procedure that uses a genetic algorithm to search multiple sequence alignments for evidence of recombination breakpoints and to identify putative recombinant sequences. GARD is an extensible and intuitive method that can be run efficiently in parallel. Extensive simulation studies show that the method nearly always outperforms other available tools in terms of both power and accuracy, and that the use of GARD to screen sequences for recombination ensures good statistical properties for methods aimed at detecting positive selection.
Freely available
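
The general idea of a genetic algorithm searching for breakpoints under a penalized fit criterion can be illustrated with a toy sketch. This is not the GARD implementation: the per-segment likelihood of a fitted phylogeny is replaced here by a hypothetical stand-in (each site carries a 0/1 label for the tree it supports, and a segment's misfit is the number of sites disagreeing with the segment majority), and the fixed per-breakpoint penalty is only an assumed, AIC-like placeholder for a real model selection criterion. All function names and parameter values are illustrative.

```python
import random

def segment_cost(signal, start, end):
    """Toy misfit of one segment: sites disagreeing with the segment majority."""
    ones = sum(signal[start:end])
    return min(ones, (end - start) - ones)

def score(signal, breakpoints, penalty=3.0):
    """Lower is better: total misfit plus an assumed fixed penalty per breakpoint."""
    edges = [0, *breakpoints, len(signal)]
    misfit = sum(segment_cost(signal, a, b) for a, b in zip(edges, edges[1:]))
    return misfit + penalty * len(breakpoints)

def mutate(breakpoints, n_sites, rng):
    """Add, drop, or shift one breakpoint; return a sorted tuple of unique positions."""
    bps = set(breakpoints)
    move = rng.choice(["add", "drop", "shift"])
    if move == "add" or not bps:
        bps.add(rng.randrange(1, n_sites))
    elif move == "drop":
        bps.discard(rng.choice(sorted(bps)))
    else:
        old = rng.choice(sorted(bps))
        bps.discard(old)
        # Keep the shifted breakpoint strictly inside the alignment.
        bps.add(min(n_sites - 1, max(1, old + rng.randint(-5, 5))))
    return tuple(sorted(bps))

def crossover(a, b, rng):
    """Child keeps each breakpoint found in either parent with probability 1/2."""
    return tuple(sorted(bp for bp in set(a) | set(b) if rng.random() < 0.5))

def search(signal, generations=200, pop_size=30, seed=0):
    """Evolve breakpoint sets; the better half of each generation survives unchanged."""
    rng = random.Random(seed)
    n = len(signal)
    # Start from the no-breakpoint model plus random single-breakpoint models.
    population = [()] + [mutate((), n, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda bps: score(signal, bps))
        elite = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), n, rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return min(population, key=lambda bps: score(signal, bps))

# Toy "alignment": the first 50 sites support tree 0, the last 50 support tree 1.
signal = [0] * 50 + [1] * 50
best = search(signal)
print(best, score(signal, best))
```

Because the better half of the population is carried over unchanged, the returned model never scores worse than the no-breakpoint model the search starts from. A real analysis would replace `segment_cost` with the fit of a phylogeny estimated separately for each segment.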


Available from: Simon D W Frost
    • "The alignment was manually edited in Bioedit, version 7.05 to preserve frame insertions and deletions if present. Because recombination may confound the results of phylogeographic inference [Schierup and Hein, 2000], all data sets for phylogeographic analyses were verified and tested negative for recombination using RIP 3.0 [Siepel et al., 1995] and GARD [Pond et al., 2006]. The sequences of each subject have been submitted to GenBank (GenBank accession numbers: KP796426-KP797835)."
    ABSTRACT: The cellular source of HIV RNA circulating in blood plasma remains unclear. Here, we investigated whether sequence analysis of HIV RNA populations circulating before combination antiretroviral therapy (cART) and HIV DNA populations in cellular subsets (CS) after cART could identify the cellular sources of circulating HIV RNA. Blood was collected from five subjects at cART initiation and again 6 months later. Naïve CD4+ T cells, resting central memory and effector memory CD4+ T cells, activated CD4+ T cells, monocytes, and natural killer cells were sorted using a fluorescence-activated cell sorter. HIV-1 env C2V3 sequences from HIV RNA in blood plasma and HIV DNA in CSs were generated using single genome sequencing. Sequences were evaluated for viral compartmentalization (Fst test) and migration events (MEs; Slatkin-Maddison and cladistic measures) between blood plasma and each CS. Viral compartmentalization was observed in 88% of all cellular subset comparisons (range: 77-100% for each subject). Most observed MEs were directed from blood plasma to CSs (52 MEs, 85.2%). In particular, there was only viral movement from plasma to NK cells (15 MEs), monocytes (7 MEs) and naïve cells (5 MEs). We observed a total of 9 MEs from activated CD4 cells (2/9 MEs), central memory T cells (3/9 MEs) and effector memory T cells (4/9 MEs) to blood plasma. Our results revealed that the HIV RNA population in blood plasma plays an important role in seeding various cellular reservoirs and that the cellular source of the HIV RNA population is activated central memory and effector memory T cells.
    Journal of Medical Virology 09/2015; DOI:10.1002/jmv.24375 · 2.35 Impact Factor
    • "which implements statistical tests associated with the programme HyPhy (Pond et al. 2005), was used to analyse an alignment of all unique haplotypes in the Welsh populations with the laboratory sequences downloaded from GenBank. We used GARD (Genetic Algorithm for Recombination Detection; Pond et al. 2006) to test for evidence for recombination prior to using tests of selection. We chose the best fitting substitution model using CodonTest (Delport et al. 2010), and then various codon-based tests for selection were implemented. "
    ABSTRACT: Xenopus laevis (the African clawed frog), which originated through hybridisation and whole genome duplication, has been used as a model for genetics and development for many years, but surprisingly little is known about immune gene variation in natural populations. The purpose of this study was to use an isolated population of X. laevis that was introduced to Wales, UK in the past 50 years to investigate how variation at the MHC compares to that at other loci, following a severe population bottleneck. Among 18 individuals, we found nine alleles based on exon 2 sequences of the Class IIb region (which includes the peptide binding region). Individuals carried from one to three of the loci identified from previous laboratory studies. Genetic variation was an order of magnitude higher at the MHC compared with three single-copy nuclear genes, but all loci showed high levels of heterozygosity and nucleotide diversity and there was not an excess of homozygosity or decrease in diversity over time that would suggest extensive inbreeding in the introduced population. Tajima's D was positive for all loci, which is consistent with a bottleneck. Moreover, comparison with published sequences identified the source of the introduced population as the Western Cape region of South Africa, where most commercial suppliers have obtained their stocks. These factors suggest that despite founding by potentially already inbred individuals, the alien population in Wales has maintained substantial genetic variation at both adaptively important and neutral genes.
    Immunogenetics 09/2015; 67(10). DOI:10.1007/s00251-015-0860-3 · 2.23 Impact Factor
    • "In order to increase the effective sample size (ESS > 200), the analysis was run in duplicate for each dataset, with 50 million chain lengths and sampled every 1000 states. In order to estimate the rate dN/dS and to investigate the selective pressure acting on specific codons in VP1 gene, SLAC, FEL (Kosakovsky Pond and Frost, 2005), FUBAR (Murrell et al., 2013) and IFEL (Pond et al., 2006) methods were used, provided from Datamonkey website. Almost the complete genome sequence of strain EIS6B (from 1 to 7365 nt) has been deposited to GenBank under the accession number KM024043. "
    ABSTRACT: Echovirus 3 (E3) serotype has been related with several neurologic diseases, although it constitutes one of the rarely isolated serotypes, with no report of epidemics in Europe. The aim of the present study was to provide insights into the molecular epidemiology and evolution of this enterovirus serotype, while an E3 strain was isolated from sewage in Greece, four years after the initial isolation of the only reported E3 strain in the same geographical region. Phylogenetic analysis of the complete VP1 genomic region of that E3 strain and of those available in GenBank suggested three main genogroups that were further subdivided into seven subgenogroups. Further evolutionary analysis suggested that VP1 genomic region of E3 was dominated by purifying selection, as the vast majority of genetic diversity presumably occurred through synonymous nucleotide substitutions and the substitution rate for complete and partial VP1 sequences was calculated to be 8.13 × 10^-3 and 7.72 × 10^-3 substitutions/site/year, respectively. The partial VP1 sequence analysis revealed the composite epidemiology of this serotype, as the strains of the three genogroups presented different epidemiological characteristics.
    Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases 03/2015; 32. DOI:10.1016/j.meegid.2015.03.008 · 3.02 Impact Factor