Automating HIV Drug Resistance Genotyping with RECall, a Freely Accessible Sequence Analysis Tool

BC Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada.
Journal of clinical microbiology (Impact Factor: 3.99). 03/2012; 50(6):1936-42. DOI: 10.1128/JCM.06689-11
Source: PubMed


Genotypic HIV drug resistance testing is routinely used to guide clinical decisions. While genotyping methods can be standardized, a slow, labor-intensive, and subjective manual sequence interpretation step is required. We therefore performed external validation of our custom software RECall, a fully automated sequence analysis pipeline. HIV-1 drug resistance genotyping was performed on 981 clinical samples at the Stanford Diagnostic Virology Laboratory. Sequencing trace files were first interpreted manually by a laboratory technician and subsequently reanalyzed by RECall, without intervention. The relative performances of the two methods were assessed by determination of the concordance of nucleotide base calls, identification of key resistance-associated substitutions, and HIV drug resistance susceptibility scoring by the Stanford Sierra algorithm. RECall is freely available at In total, 875 of 981 sequences were analyzed by both human and RECall interpretation. RECall analysis required minimal hands-on time and resulted in a 25-fold improvement in processing speed (∼150 technician-hours versus ∼6 computation-hours). Excellent concordance was obtained between human and automated RECall interpretation (99.7% agreement for >1,000,000 bases compared). Nearly all discordances (99.4%) were due to nucleotide mixtures being called by one method but not the other. Similarly, 98.6% of key antiretroviral resistance-associated mutations observed were identified by both methods, resulting in 98.5% concordance of resistance susceptibility interpretations. This automated sequence analysis tool provides both standardization of analysis and a significant improvement in data workflow. The time-consuming, error-prone, and dreadfully boring manual sequence analysis step is replaced with a fully automated system without compromising the accuracy of reported HIV drug resistance data.
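The base-call concordance reported above can be illustrated with a minimal sketch. It assumes the manual and automated sequences are already aligned to the same length; the function name `compare_calls` and the returned keys are hypothetical and are not part of RECall. A discordance is counted as mixture-related when exactly one of the two calls is an IUPAC mixture code.

```python
# IUPAC codes that stand for a mixture of two or more nucleotides.
IUPAC_MIXTURES = set("RYKMSWBDHVN")


def compare_calls(manual: str, automated: str) -> dict:
    """Compare two aligned, equal-length sequences position by position.

    Hypothetical helper for illustration only; not RECall's own code.
    """
    if len(manual) != len(automated) or not manual:
        raise ValueError("sequences must be non-empty and of equal length")

    discordant = 0
    mixture_discordant = 0
    for a, b in zip(manual.upper(), automated.upper()):
        if a == b:
            continue
        discordant += 1
        # A mixture was called by one method but not the other.
        if (a in IUPAC_MIXTURES) != (b in IUPAC_MIXTURES):
            mixture_discordant += 1

    total = len(manual)
    return {
        "bases": total,
        "agreement": (total - discordant) / total,
        "discordant": discordant,
        "mixture_discordant": mixture_discordant,
    }


if __name__ == "__main__":
    # One mixture (R = A/G) called manually but not by the automated method.
    print(compare_calls("ACGTRA", "ACGTAA"))
```

The reported 25-fold speed-up follows directly from the stated hours: roughly 150 technician-hours divided by roughly 6 computation-hours.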



Available from: Chanson J Brumme, Jan 21, 2015
    • "Plasma samples were processed as previously described using a 1770 bp amplicon PCR spanning codons 1–99 of protease and 1–430 of reverse transcriptase (Pillay et al., 2008). Editing of sequences was performed using RECall software v2.10 (Woods et al., 2012), and drug resistant mutations were identified using the Stanford HIV database algorithm V7.0 (http:// Inter-rater agreements were calculated using Cohen's kappa coefficient (STATA v11, StataCorp, College Station, USA). "
    ABSTRACT: Paired plasma and dried blood spots (DBS) from 232 South African HIV-infected children initiating antiretroviral therapy (ART) were genotyped for drug resistance mutations; most of the children had prior exposure to ART for prevention of mother-to-child transmission. Non-nucleoside reverse transcriptase inhibitor mutations were most commonly detected in both specimen types, particularly Y181C/I and K103N/S. Resistance interpretation concordance was achieved in 97% of pairs, with 7 children having mutations detected in DBS only. These results validate the preferential use of DBS specimens for HIVDR genotyping in this patient group. Copyright © 2015. Published by Elsevier B.V.
    Journal of virological methods 07/2015; 223. DOI:10.1016/j.jviromet.2015.07.005 · 1.78 Impact Factor
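The citing study above reports inter-rater agreement as Cohen's kappa, computed in STATA. As a rough illustration of the statistic itself, the sketch below implements the standard unweighted formula, kappa = (p_o - p_e) / (1 - p_e), in plain Python. The function name and the example labels are hypothetical and are not taken from that study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of labels")

    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up resistance calls (R = resistant, S = susceptible) for 4 pairs.
    plasma = ["R", "R", "S", "S"]
    dbs = ["R", "S", "S", "S"]
    print(cohens_kappa(plasma, dbs))
```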
    • "Amplicons were bulk-sequenced bidirectionally on a 3130xl and/or 3730xl automated DNA sequencer (Applied Biosystems). Chromatograms were analyzed using Sequencher v5.0 (Genecodes) or RECall [51] with nucleotide mixtures called if the height of the secondary peak exceeded 25% of the height of the dominant peak (Sequencher) or 20% of the dominant peak area (RECall). HIV-1 sequences were confirmed as subtype B using the recombinant identification program (RIP; and aligned to the HIV-1 subtype B reference strain HXB2. "
    ABSTRACT: Background: The reproducible nature of HIV-1 escape from HLA-restricted CD8+ T-cell responses allows the identification of HLA-associated viral polymorphisms "at the population level", that is, via analysis of cross-sectional, linked HLA/HIV-1 genotypes by statistical association. However, elucidating their timing of selection traditionally requires detailed longitudinal studies, which are challenging to undertake on a large scale. We investigate whether the extent and relative timecourse of immune-driven HIV adaptation can be inferred via comparative cross-sectional analysis of independent early and chronic infection cohorts. Results: Similarly-powered datasets of linked HLA/HIV-1 genotypes from individuals with early (median < 3 months) and chronic untreated HIV-1 subtype B infection, matched for size (N > 200/dataset), HLA class I and HIV-1 Gag/Pol/Nef diversity, were established. These datasets were first used to define a list of 162 known HLA-associated polymorphisms detectable at the population level in cohorts of the present size and host/viral genetic composition. Of these 162 known HLA-associated polymorphisms, 15% (occurring at 14 Gag, Pol and Nef codons) were already detectable via statistical association in the early infection dataset at p ≤ 0.01 (q < 0.2), identifying them as the most consistently rapidly escaping sites in HIV-1. Among these were known rapidly-escaping sites (e.g. B*57-Gag-T242N) and others not previously appreciated to be reproducibly rapidly selected (e.g. A*31:01-associated adaptations at Gag codons 397, 401 and 403). Escape prevalence in early infection correlated strongly with first-year escape rates (Pearson's R = 0.68, p = 0.0001), supporting cross-sectional parameters as reliable indicators of longitudinally-derived measures. Comparative analysis of early and chronic datasets revealed that, on average, the prevalence of HLA-associated polymorphisms more than doubles between these two infection stages in persons harboring the relevant HLA (p < 0.0001, consistent with frequent and reproducible escape), but remains relatively stable in persons lacking the HLA (p = 0.15, consistent with slow reversion). Published HLA-specific Hazard Ratios for progression to AIDS correlated positively with average escape prevalence in early infection (Pearson's R = 0.53, p = 0.028), consistent with high early within-host HIV-1 adaptation (via rapid escape and/or frequent polymorphism transmission) as a correlate of progression. Conclusion: Cross-sectional host/viral genotype datasets represent an underutilized resource to identify reproducible early pathways of HIV-1 adaptation and identify correlates of protective immunity.
    Retrovirology 08/2014; 11(1):64. DOI:10.1186/PREACCEPT-8878001841312932 · 4.19 Impact Factor
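The mixture-calling rule quoted in the excerpt above (a secondary peak reaching 20% of the dominant peak area for RECall) can be sketched as a simple threshold on per-base peak areas. This is an illustration under stated assumptions, not RECall's implementation: the input format (a dict of base to peak area), the function name `call_base`, and the use of an inclusive `>=` comparison are all hypothetical.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(areas: dict, threshold: float = 0.20) -> str:
    """Return the IUPAC code for one position from per-base peak areas.

    Any base whose area is at least `threshold` times the dominant
    area is included in the call (assumed rule, for illustration).
    """
    dominant = max(areas.values())
    if dominant <= 0:
        raise ValueError("no signal at this position")
    called = frozenset(
        base for base, area in areas.items() if area >= threshold * dominant
    )
    return IUPAC[called]


if __name__ == "__main__":
    # G is 25% of A, so the position is called as the A/G mixture R.
    print(call_base({"A": 1000, "C": 10, "G": 250, "T": 0}))
    # G is 15% of A, below the cutoff, so the call is plain A.
    print(call_base({"A": 1000, "C": 10, "G": 150, "T": 0}))
```

Passing `threshold=0.25` with peak heights instead of areas would mimic the Sequencher setting mentioned in the same excerpt.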
    • "Ambiguous mutations, which consist of mixed nucleotides at a sequence position and named using the standard IUPAC ambiguous nucleotide codes, were determined and automatically called using customized software, Recall [37] when the sequencing signal intensity of the minor base was ≥20% of the major base signal at a nucleotide position on bi-directional sequences after subtracting background noise. Ambiguous mutations were extracted from each of sequences and tallied at country level. "
    ABSTRACT: Detection of recent HIV infections is a prerequisite for reliable estimations of transmitted HIV drug resistance (t-HIVDR) and incidence. However, accurately identifying recent HIV infection is challenging, due partly to the limitations of current serological tests. Ambiguous nucleotides are newly emerged mutations in quasispecies and accumulate with time since infection. We utilized ambiguous mutations to establish a measurement for detecting recent HIV infection and monitoring early HIVDR development. Ambiguous nucleotides were extracted from HIV-1 pol-gene sequences in the datasets of recent (HIVDR threshold surveys [HIVDR-TS] in 7 countries; n=416) and established infections (1 HIVDR monitoring survey at baseline; n=271). An ambiguous mutation index of 2.04×10⁻³ nts/site was detected in recent HIV-1 infections, which is equivalent to the previously reported HIV-1 substitution rate (2×10⁻³ nts/site/year). However, a significantly higher index (14.41×10⁻³ nts/site) was found in established infections. Using this substitution rate, 75.2% of subjects in HIVDR-TS (with the exception of the Vietnam dataset) and 3.3% of those in HIVDR-baseline were classified as recent infections within one year. We also calculated mutation scores at the amino acid level at HIVDR sites based on ambiguous or fitted mutations. The overall mutation scores caused by ambiguous mutations increased (0.54×10⁻² to 3.48×10⁻²/DR-site), whereas those caused by fitted mutations remained stable (7.50×10⁻² to 7.89×10⁻²/DR-site) in both recent and established infections, indicating that t-HIVDR exists in drug-naïve populations regardless of infection status, in which new HIVDR continues to emerge. Our findings suggest that characterization of ambiguous mutations in HIV may serve as an additional tool to differentiate recent from established infections and to monitor HIVDR emergence.
    PLoS ONE 10/2013; 8(10):e77649. DOI:10.1371/journal.pone.0077649 · 3.23 Impact Factor
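An ambiguous mutation index of the kind described above, expressed in ambiguous nucleotides per site, can be sketched as a simple count over a sequence. The abstract does not spell out the exact classification rule, so the `looks_recent` cutoff below (index at or below 2×10⁻³ nts/site) is an assumption made for illustration, as are both function names and the choice to count `N` as ambiguous.

```python
# Non-ACGT IUPAC codes; counting N as ambiguous is an assumption here.
AMBIGUOUS = set("RYKMSWBDHVN")


def ambiguity_index(sequence: str) -> float:
    """Ambiguous nucleotides per site, ignoring alignment gaps ('-')."""
    sites = [base for base in sequence.upper() if base != "-"]
    if not sites:
        raise ValueError("sequence has no sites")
    return sum(base in AMBIGUOUS for base in sites) / len(sites)


def looks_recent(sequence: str, cutoff: float = 2e-3) -> bool:
    """Assumed rule: flag as recent when the index is at or below `cutoff`."""
    return ambiguity_index(sequence) <= cutoff


if __name__ == "__main__":
    clean = "ACGT" * 250                   # 1000 sites, no mixtures
    mixed = "ACGT" * 249 + "RYKA"          # 1000 sites, 3 mixtures
    print(ambiguity_index(clean), looks_recent(clean))
    print(ambiguity_index(mixed), looks_recent(mixed))
```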