Automating HIV Drug Resistance Genotyping with RECall, a Freely Accessible Sequence Analysis Tool

BC Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada.
Journal of clinical microbiology (Impact Factor: 3.99). 03/2012; 50(6):1936-42. DOI: 10.1128/JCM.06689-11
Source: PubMed


Genotypic HIV drug resistance testing is routinely used to guide clinical decisions. While genotyping methods can be standardized, a slow, labor-intensive, and subjective manual sequence interpretation step is required. We therefore performed external validation of our custom software RECall, a fully automated sequence analysis pipeline. HIV-1 drug resistance genotyping was performed on 981 clinical samples at the Stanford Diagnostic Virology Laboratory. Sequencing trace files were first interpreted manually by a laboratory technician and subsequently reanalyzed by RECall, without intervention. The relative performances of the two methods were assessed by determination of the concordance of nucleotide base calls, identification of key resistance-associated substitutions, and HIV drug resistance susceptibility scoring by the Stanford Sierra algorithm. RECall is freely available at [URL missing in source]. In total, 875 of 981 sequences were analyzed by both human and RECall interpretation. RECall analysis required minimal hands-on time and resulted in a 25-fold improvement in processing speed (∼150 technician-hours versus ∼6 computation-hours). Excellent concordance was obtained between human and automated RECall interpretation (99.7% agreement for >1,000,000 bases compared). Nearly all discordances (99.4%) were due to nucleotide mixtures being called by one method but not the other. Similarly, 98.6% of key antiretroviral resistance-associated mutations observed were identified by both methods, resulting in 98.5% concordance of resistance susceptibility interpretations. This automated sequence analysis tool provides both standardization of analysis and a significant improvement in data workflow. The time-consuming, error-prone, and dreadfully boring manual sequence analysis step is replaced with a fully automated system without compromising the accuracy of reported HIV drug resistance data.
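
The base-call concordance described in the abstract can be illustrated with a minimal sketch. This is not RECall's code: the function name `compare_calls`, the input sequences, and the rule for classifying discordances are assumptions made for illustration. It compares two pre-aligned sequences position by position, treating IUPAC ambiguity codes as sets of bases, and separates discordances where the two calls share at least one base (typically a mixture called by only one method) from complete disagreements.

```python
# IUPAC nucleotide codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compare_calls(seq_a: str, seq_b: str) -> dict:
    """Compare two pre-aligned, equal-length sequences (hypothetical helper)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    agree = mixture = full = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        set_a, set_b = set(IUPAC[a]), set(IUPAC[b])
        if set_a == set_b:
            agree += 1
        elif set_a & set_b:
            # Overlapping base sets, e.g. R (A/G) versus A:
            # typically a mixture called by only one method.
            mixture += 1
        else:
            full += 1
    total = len(seq_a)
    return {
        "total": total,
        "agree": agree,
        "mixture_discordant": mixture,
        "full_discordant": full,
        "concordance": agree / total if total else 0.0,
    }


if __name__ == "__main__":
    # Made-up example: one method calls a mixture (R) where the other calls A.
    print(compare_calls("ACGTR", "ACGTA"))
```

Summing these counts over all paired sequences would give an overall agreement figure of the kind reported above (99.7%), under the assumption that the sequences are already aligned to a common reference.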



Available from: Chanson J Brumme, Jan 21, 2015
    • "NC_004102]) using a method previously described (Lamoury, Jacka et al. 2015). PCR amplicons were sequenced by Sanger sequencing and sequence chromatograms were processed using RECall: a fully automated sequence analysis pipeline (Woods, Brumme et al. 2012). Subtypes were determined by constructing a subtyping tree using the panel of reference sequences classified by Smith et al (Smith, Bukh et al. 2014) (Supplementary Figure 1.). "
    ABSTRACT: The aim of this study was to identify factors associated with phylogenetic clustering among people with recently acquired hepatitis C virus (HCV) infection. Participants with an available sample at the time of HCV detection were selected from three studies: the Australian Trial in Acute Hepatitis C (ATAHC), the Hepatitis C Incidence and Transmission Study - Prison (HITS-p), and the Hepatitis C Incidence and Transmission Study - Community (HITS-c). HCV RNA was extracted and the Core to E2 region of HCV was sequenced. Clusters were identified from maximum likelihood trees with 1000 bootstrap replicates using thresholds of 90% bootstrap support and 5% genetic distance. Among 225 participants with available Core-E2 sequence (ATAHC, n=113; HITS-p, n=90; and HITS-c, n=22), HCV genotype prevalence was: G1a: 38% (n=86), G1b: 5% (n=12), G2a: 1% (n=2), G2b: 5% (n=11), G3a: 48% (n=109), G6a: 1% (n=2) and G6l: 1% (n=3). Of participants included in phylogenetic trees, 22% were in a pair/cluster (G1a: 35%, 30/85, mean maximum genetic distance = 0.031; G3a: 11%, 12/106, mean maximum genetic distance = 0.021; other genotypes: 21%, 6/28, mean maximum genetic distance = 0.023). Among HCV/HIV co-infected participants, 50% (18/36) were in a pair/cluster, compared to 16% (30/183) with HCV mono-infection (P<0.001). Factors independently associated with phylogenetic clustering were HIV co-infection [vs. HCV mono-infection; adjusted odds ratio (AOR) 4.24; 95%CI 1.91, 9.39], and HCV G1a infection (vs. other HCV genotypes; AOR 3.33, 95%CI 0.14, 0.61). HCV treatment and prevention strategies, including enhanced antiviral therapy, should be optimised. The impact of targeting of HCV treatment as prevention to populations with higher phylogenetic clustering, such as those with HIV co-infection, could be explored through mathematical modelling.
    Full-text · Article · Nov 2015 · Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases
    • "Plasma samples were processed as previously described using a 1770 bp amplicon PCR spanning codons 1–99 of protease and 1–430 of reverse transcriptase (Pillay et al., 2008). Editing of sequences was performed using RECall software v2.10 (Woods et al., 2012), and drug resistant mutations were identified using the Stanford HIV database algorithm V7.0 (http://…). Inter-rater agreements were calculated using Cohen's kappa coefficient (STATA v11, StataCorp, College Station, USA). "
    ABSTRACT: Paired plasma and dried blood spots (DBS) from 232 South African HIV-infected children initiating antiretroviral therapy (ART) were genotyped for drug resistance mutations, most of whom had prior exposure to ART for prevention-of-mother-to-child-transmission. Non-nucleoside reverse transcriptase inhibitor mutations were most commonly detected in both specimen types, particularly Y181C/I and K103N/S. Resistance interpretation concordance was achieved in 97% of pairs with 7 children having mutations detected in DBS only. These results validate the preferential use of DBS specimens for HIVDR genotyping in this patient group. Copyright © 2015. Published by Elsevier B.V.
    No preview · Article · Jul 2015 · Journal of virological methods
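
The excerpt above reports inter-rater agreement as Cohen's kappa, computed in STATA. As a rough illustration of that statistic only (not the STATA procedure, and using made-up labels), the sketch below computes kappa for two lists of categorical calls; the function name `cohens_kappa` is an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical resistance calls (R = resistant, S = susceptible).
    plasma = ["R", "R", "S", "S"]
    dbs = ["R", "S", "S", "S"]
    print(cohens_kappa(plasma, dbs))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one category (for example, "susceptible") dominates both sets of calls.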
    • "Amplicons were bulk-sequenced bidirectionally on a 3130xl and/or 3730xl automated DNA sequencer (Applied Biosystems). Chromatograms were analyzed using Sequencher v5.0 (Genecodes) or RECall [51] with nucleotide mixtures called if the height of the secondary peak exceeded 25% of the height of the dominant peak (Sequencher) or 20% of the dominant peak area (RECall). HIV-1 sequences were confirmed as subtype B using the recombinant identification program (RIP; [URL missing in source]) and aligned to the HIV-1 subtype B reference strain HXB2. "
    ABSTRACT: Background: The reproducible nature of HIV-1 escape from HLA-restricted CD8+ T-cell responses allows the identification of HLA-associated viral polymorphisms "at the population level", that is, via analysis of cross-sectional, linked HLA/HIV-1 genotypes by statistical association. However, elucidating their timing of selection traditionally requires detailed longitudinal studies, which are challenging to undertake on a large scale. We investigate whether the extent and relative timecourse of immune-driven HIV adaptation can be inferred via comparative cross-sectional analysis of independent early and chronic infection cohorts. Results: Similarly-powered datasets of linked HLA/HIV-1 genotypes from individuals with early (median < 3 months) and chronic untreated HIV-1 subtype B infection, matched for size (N > 200/dataset), HLA class I and HIV-1 Gag/Pol/Nef diversity, were established. These datasets were first used to define a list of 162 known HLA-associated polymorphisms detectable at the population level in cohorts of the present size and host/viral genetic composition. Of these 162 known HLA-associated polymorphisms, 15% (occurring at 14 Gag, Pol and Nef codons) were already detectable via statistical association in the early infection dataset at p ≤ 0.01 (q < 0.2), identifying them as the most consistently rapidly escaping sites in HIV-1. Among these were known rapidly-escaping sites (e.g. B*57-Gag-T242N) and others not previously appreciated to be reproducibly rapidly selected (e.g. A*31:01-associated adaptations at Gag codons 397, 401 and 403). Escape prevalence in early infection correlated strongly with first-year escape rates (Pearson's R = 0.68, p = 0.0001), supporting cross-sectional parameters as reliable indicators of longitudinally-derived measures. Comparative analysis of early and chronic datasets revealed that, on average, the prevalence of HLA-associated polymorphisms more than doubles between these two infection stages in persons harboring the relevant HLA (p < 0.0001, consistent with frequent and reproducible escape), but remains relatively stable in persons lacking the HLA (p = 0.15, consistent with slow reversion). Published HLA-specific Hazard Ratios for progression to AIDS correlated positively with average escape prevalence in early infection (Pearson's R = 0.53, p = 0.028), consistent with high early within-host HIV-1 adaptation (via rapid escape and/or frequent polymorphism transmission) as a correlate of progression. Conclusion: Cross-sectional host/viral genotype datasets represent an underutilized resource to identify reproducible early pathways of HIV-1 adaptation and identify correlates of protective immunity.
    Full-text · Article · Aug 2014 · Retrovirology
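
The mixture-calling rule quoted in the excerpt above (a secondary peak exceeding 20% of the dominant peak area for RECall, or 25% of the dominant peak height for Sequencher) can be sketched as a simple threshold on per-base signal. The sketch is illustrative only and is not RECall's or Sequencher's implementation: the function name `call_base`, the input format (one signal value per base at a single position), and the fallback to `N` are assumptions.

```python
from typing import Mapping

# IUPAC nucleotide codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> IUPAC code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def call_base(signal: Mapping[str, float], threshold: float = 0.20) -> str:
    """Call one position from per-base peak signal (hypothetical helper).

    `signal` maps each of A, C, G, T to a peak area (or height). Any base
    whose signal exceeds `threshold` times the dominant signal is included
    in the call, so two or more qualifying bases yield an IUPAC mixture code.
    """
    dominant = max(signal.values())
    if dominant <= 0:
        return "N"  # assumption: no usable signal is reported as N
    called = frozenset(b for b, v in signal.items() if v > threshold * dominant)
    return CODE_FOR[called]


if __name__ == "__main__":
    # Made-up peak areas: G is 22% of A, above a 20% cutoff but below 25%.
    position = {"A": 100.0, "C": 2.0, "G": 22.0, "T": 1.0}
    print(call_base(position, threshold=0.20))
    print(call_base(position, threshold=0.25))
```

With these made-up values, the same position is called as a mixture under the 20% cutoff and as a single base under the 25% cutoff, which shows how the choice of threshold alone can produce mixture discordances between methods.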