Wang C, Mitsuya Y, Gharizadeh B, et al. Characterization of mutation spectra with ultra-deep pyrosequencing: application to HIV-1 drug resistance

Division of Infectious Diseases, Department of Medicine, Stanford University, Stanford, CA 94305, USA.
Genome Research (Impact Factor: 14.63). 09/2007; 17(8):1195-201. DOI: 10.1101/gr.6468307
Source: PubMed

ABSTRACT The detection of mutant spectra within a population of microorganisms is critical for the management of drug-resistant infections. We performed ultra-deep pyrosequencing to detect minor sequence variants in HIV-1 protease and reverse transcriptase (RT) genes from clinical plasma samples. We estimated empirical error rates from four HIV-1 plasmid clones and used them to develop a statistical approach to distinguish authentic minor variants from sequencing errors in eight clinical samples. Ultra-deep pyrosequencing detected an average of 58 variants per sample compared with an average of eight variants per sample detected by conventional direct-PCR dideoxynucleotide sequencing. In the clinical sample with the largest number of minor sequence variants, all 60 variants present in ≥3% of genomes and 20 of 35 variants present in <3% of genomes were confirmed by limiting dilution sequencing. With appropriate analysis, ultra-deep pyrosequencing is a promising method for characterizing genetic diversity and detecting minor yet clinically relevant variants in biological samples with complex genetic populations.
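
The abstract does not spell out the statistical approach, so the following is only a minimal sketch of one common way to use an empirical error rate: a one-sided binomial test asking how likely it is that sequencing error alone would produce at least the observed number of variant reads at a site. The binomial model, the function names, the example error rate, and the significance threshold `alpha` are all assumptions made for illustration, not the authors' method.

```python
import math


def binom_log_pmf(i, n, p):
    """Log of the Binomial(n, p) probability of exactly i successes (0 < p < 1)."""
    return (
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log1p(-p)
    )


def error_tail_pvalue(variant_reads, coverage, error_rate):
    """P(X >= variant_reads) for X ~ Binomial(coverage, error_rate).

    A small value means sequencing error alone is unlikely to explain
    the observed number of variant reads at this position.
    """
    tail = sum(
        math.exp(binom_log_pmf(i, coverage, error_rate))
        for i in range(variant_reads, coverage + 1)
    )
    return min(1.0, tail)


def looks_authentic(variant_reads, coverage, error_rate, alpha=0.001):
    """Flag a variant as unlikely to be pure error (alpha is an assumed threshold)."""
    return error_tail_pvalue(variant_reads, coverage, error_rate) < alpha


if __name__ == "__main__":
    # Hypothetical site: 5000 reads, assumed per-base error rate of 0.1%.
    for k in (5, 20, 150):
        print(k, error_tail_pvalue(k, 5000, 0.001), looks_authentic(k, 5000, 0.001))
```

In practice the error rate would be estimated per position or per sequence context from control data such as the plasmid clones, and the threshold would need correction for the number of sites tested.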

  • Source
    • "The advent of so-called 'next-generation' sequencing (NGS) technologies, with their potential to survey thousands of viral sequences from a given host, has dramatically improved our ability to characterize within-host sequence diversity in viral infections. NGS has been applied to address such questions as overall viral diversity within-hosts (Lauck et al., 2012; Wright et al., 2011); evolution of T-cell epitopes under selection by the host immune system (Bimber et al., 2010; Hughes et al., 2010, 2012; Mudd et al., 2012; O'Connor et al., 2012; Walsh et al., 2013); response of viruses to selection imposed by antiviral drugs (Cannon et al., 2008; Hedskog et al., 2010; Le et al., 2009; Wang et al., 2010a); differences between virus subpopulations infecting different host cell types (Rozera et al., 2009); and population bottlenecks in infection (Wang et al., 2010b). Here we discuss statistical methods for using NGS data to understand nucleotide sequence diversity of within-host viral populations, with particular emphasis on the comparison of synonymous and nonsynonymous (amino acid-altering) nucleotide diversity in coding regions."
    ABSTRACT: Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. Copyright © 2014. Published by Elsevier B.V.
    Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases 03/2015; DOI:10.1016/j.meegid.2014.11.026 · 3.02 Impact Factor
  • Source
    • "The ability to discover rare viral variants makes our tool applicable for monitoring and quantifying an HIV population structure to dissect its evolutionary landscape and study genomic interaction. In particular, our approach allows for the discovery of rare mutations and variants that are of particular interest because of their potential influence on drug resistance and treatment failure (Liu et al., 2011; Palmer et al., 2006; Wang et al., 2007)."
    ABSTRACT: Motivation: Next-generation sequencing technologies sequence viruses with ultra-deep coverage, thus promising to revolutionize our understanding of the underlying diversity of viral populations. While the sequencing coverage is high enough that even rare viral variants are sequenced, the presence of sequencing errors makes it difficult to distinguish between rare variants and sequencing errors. Results: In this article, we present a method to overcome the limitations of sequencing technologies and assemble a diverse viral population that allows for the detection of previously undiscovered rare variants. The proposed method consists of a high-fidelity sequencing protocol and an accurate viral population assembly method, referred to as Viral Genome Assembler (VGA). The proposed protocol is able to eliminate sequencing errors by using individual barcodes attached to the sequencing fragments. Highly accurate data in combination with deep coverage allow VGA to assemble rare variants. VGA uses an expectation–maximization algorithm to estimate abundances of the assembled viral variants in the population. Results on both synthetic and real datasets show that our method is able to accurately assemble an HIV viral population and detect rare variants previously undetectable due to sequencing errors. VGA outperforms state-of-the-art methods for genome-wide viral assembly. Furthermore, our method is the first viral assembly method that scales to millions of sequencing reads. Availability: Our tool VGA is freely available at Contact:;
    Bioinformatics 06/2014; 30(12):i329-i337. DOI:10.1093/bioinformatics/btu295 · 4.62 Impact Factor
  • Source
    • "Existing tools for genetic diversity evaluation in viral NGS-generated sequences, intended for 454 and Illumina platforms (Beerenwinkel et al. 2012; Goya et al. 2010; Prosperi et al. 2011; Willerth et al. 2010; Zagordi et al. 2012; Wang et al. 2007; Fischer et al. 2010; Lataillade et al. 2010; Tsibris et al. 2009), are based on several techniques aiming at SNV calling and haplotype reconstruction, as well. Most of the NGS-employing HIV studies so far published are based on the 454 platform (Wang et al. 2007; Bruselles et al. 2009; Eshleman et al. 2011; Macalalad et al. 2012; Wang et al. 2007; Koboldt et al. 2012; Eriksson et al. 2008; Westbrooks et al. 2008)."
    ABSTRACT: In this paper we describe the structure and use of a computational tool for the analysis of viral genetic diversity on data generated by high-throughput sequencing. The main motivation for this work is to better understand the genetic diversity of viruses with high rates of nucleotide substitution, such as HIV-1 and influenza. This work focuses on two main fronts: the first is a novel alignment strategy that allows the recovery of the highest possible number of short reads; the second is the estimation of the populational genetic diversity through a Bayesian approach based on Dirichlet distributions inspired by word count modeling. The software is available as an integrated platform capable of performing all operations described here; it is written in C# (Microsoft) and runs on Windows platforms. The executable, the documentation and the auxiliary files are freely available and may be obtained from:
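
The Dirichlet-based estimation mentioned in the abstract directly above can be illustrated with the textbook Dirichlet-multinomial update for a single alignment column. This is a hedged sketch, not the cited tool's model: the symmetric prior, the function name, and the example counts are assumptions chosen for illustration.

```python
def dirichlet_posterior_mean(counts, prior=1.0, bases="ACGT"):
    """Posterior mean base frequencies at one site.

    counts: mapping from base to observed read count at the site.
    prior:  symmetric Dirichlet pseudo-count per base (an assumed value).
    With a Dirichlet prior and multinomial counts, the posterior is again
    Dirichlet with parameters prior + count, and its mean is the
    normalized parameter vector.
    """
    alpha = {b: prior + counts.get(b, 0) for b in bases}
    total = sum(alpha.values())
    return {b: alpha[b] / total for b in bases}


if __name__ == "__main__":
    # Hypothetical column: 96 reads show A, 4 reads show G.
    print(dirichlet_posterior_mean({"A": 96, "G": 4}))
```

The pseudo-counts keep unobserved bases at a small nonzero frequency, which is the usual reason for preferring this estimate over raw proportions at low coverage.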