HIV-Specific Probabilistic Models of Protein Evolution

University of Oxford, United Kingdom
PLoS ONE (Impact Factor: 3.23). 02/2007; 2(6):e503. DOI: 10.1371/journal.pone.0000503
Source: PubMed


Comparative sequence analyses, including such fundamental bioinformatics techniques as similarity searching, sequence alignment and phylogenetic inference, have become a mainstay for researchers studying type 1 Human Immunodeficiency Virus (HIV-1) genome structure and evolution. Implicit in comparative analyses is an underlying model of evolution, and the chosen model can significantly affect the results. In general, evolutionary models describe the probabilities of replacing one amino acid character with another over a period of time. Most widely used evolutionary models for protein sequences have been derived from curated alignments of hundreds of proteins, usually based on mammalian genomes. It is unclear to what extent these empirical models are generalizable to a very different organism, such as HIV-1, the most extensively sequenced organism in existence. We developed a maximum likelihood model-fitting procedure, applied it to a collection of HIV-1 alignments sampled from different viral genes, and inferred two empirical substitution models, suitable for describing between- and within-host evolution. Our procedure pools the information from multiple sequence alignments, and the provided software implementation can be run efficiently in parallel on a computer cluster. We describe how the inferred substitution models can be used to generate scoring matrices suitable for alignment and similarity searches. Our models had a consistently superior fit relative to the best existing models and to parameter-rich data-driven models when benchmarked on independent HIV-1 alignments, demonstrating evolutionary biases in amino-acid substitution that are unique to HIV and are not captured by the existing models. The scoring matrices derived from the models showed a marked difference from common amino-acid scoring matrices. The use of an appropriate evolutionary model recovered a known viral transmission history, whereas a poorly chosen model introduced phylogenetic error.
We argue that our model derivation procedure is immediately applicable to other organisms with extensive sequence data available, such as Hepatitis C and Influenza A viruses.
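
As a rough illustration of how a substitution model can be turned into a scoring matrix, the sketch below builds a reversible rate matrix from exchangeabilities and equilibrium frequencies, exponentiates it to get replacement probabilities over a divergence time t, and takes log-odds scores log2(P_ij(t) / pi_j). This is a generic textbook construction, not necessarily the procedure used in the paper. The three-state alphabet, the values in `EXCH` and `FREQS`, and all function names are hypothetical; a real model would use 20 amino acids and fitted parameters.

```python
import math

# Hypothetical toy parameters (3 states instead of 20 amino acids).
EXCH = [[0.0, 2.0, 1.0],
        [2.0, 0.0, 0.5],
        [1.0, 0.5, 0.0]]   # symmetric exchangeabilities (assumed values)
FREQS = [0.5, 0.3, 0.2]    # equilibrium frequencies (assumed values)

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, squarings=10, terms=20):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def rate_matrix(exch, freqs):
    """Reversible rate matrix Q_ij = s_ij * pi_j, scaled to one expected substitution per unit time."""
    n = len(freqs)
    q = [[exch[i][j] * freqs[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[x / rate for x in row] for row in q]

def log_odds_scores(exch, freqs, t):
    """Scores S_ij = log2(P_ij(t) / pi_j), where P(t) = exp(Q t)."""
    q = rate_matrix(exch, freqs)
    p = mat_exp([[x * t for x in row] for row in q])
    n = len(freqs)
    return [[math.log2(p[i][j] / freqs[j]) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    for row in log_odds_scores(EXCH, FREQS, t=0.5):
        print(["%.2f" % s for s in row])
```

Because the toy model is reversible, the resulting score matrix is symmetric, and the choice of t controls how divergent the sequences being scored are assumed to be.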



Available from: Laura Heath,
    • "Perhaps this could be derived from the application of not enough realistic models of HIV-1 evolution as suggested in [17]; see also [20]. Knowledge on HIV-1 molecular evolution can also be used to develop realistic models of evolution [21] [22] that can be applied for additional purposes such as the prediction of resistance mutations [23] or the evolutionary reply of the viral population, genotypic resistance testing [24] [25]. "
    ABSTRACT: A variety of enzyme inhibitors have been developed to combat HIV-1; however, the fast evolutionary rate of this virus commonly leads to the emergence of resistance mutations that ultimately allow the mutant virus to survive. This review explores the main genetic consequences of HIV-1 molecular evolution during antiviral therapies, including viral genetic diversity and molecular adaptation. The role of recombination in the generation of drug resistance is also analyzed. Besides the investigation and discussion of published works, an evolutionary analysis of protease-coding genes collected from patients before and after treatment with different protease inhibitors was included to validate previous studies. Finally, the review discusses the importance of considering the genetic consequences of antiviral therapies in models of HIV-1 evolution that could improve current genotypic resistance testing and treatment design.
    Computational and Mathematical Methods in Medicine 07/2015; 2015:1-9. DOI:10.1155/2015/395826 · 0.77 Impact Factor
    • "All amino acid evolutionary rate models available in ProtTest were examined, as were the parameters "+I," "+G," and "+F". (Dayhoff et al. 1978), JTT (Jones et al. 1992), WAG (Whelan and Goldman 2001), mtREV (Adachi and Hasegawa 1996), MtMam (Cao et al. 1994), VT (Müller and Vingron 2000), CpREV (Adachi et al. 2000), RtREV (Dimmic et al. 2002), MtArt (Abascal et al. 2007), HIVb/HIVw (Nickle et al. 2007), LG (Le and Gascuel 2008), and Blosum62 (Henikoff 1992). Ideally, we would optimize tree topology, branch lengths, and parameters of the model for each model investigated. "
    ABSTRACT: In this paper we present a de novo assembly of the transcriptome of the damselfly, Enallagma hageni, through the use of 454 pyrosequencing. E. hageni is a member of the suborder Zygoptera within the order Odonata, and the Odonata are the basal lineage of the winged insects (Pterygota). To date, sequence data used in phylogenetic analysis of Enallagma species have been derived from either mtDNA or ribosomal nuclear DNA. This transcriptome contained 31,661 contigs that were assembled and translated into 14,813 individual open reading frames. Using these data, we constructed an extensive dataset of 634 orthologous nuclear protein-coding genes across 11 species of Arthropoda, and used Bayesian techniques to elucidate Enallagma's place in the Arthropod phylogenetic tree. Additionally, we demonstrate that the Enallagma transcriptome contains 169 genes that are evolving at rates that differ relative to the rest of the transcriptome (29 accelerated and 140 decreased), and through multiple Gene Ontology searches and clustering methods, we present the first functional-annotation of any palaeopteran's transcriptome in the literature.
    G3-Genes Genomes Genetics 03/2013; 3(4). DOI:10.1534/g3.113.005637 · 3.20 Impact Factor
    • "Since CD4 counts and information on viral load were unavailable for the early samples, we employed computational approaches to infer diversity changes and to detect positive selection acting on the env gene. In particular, we are interested in whether positive selection pressure differs between the early and recent samples and between within-host and between-host evolution [16], [17]. We are also interested in detecting sites in the env gene targeted by the human immune system. "
    ABSTRACT: HIV-1 infection has been on the rise in Japan recently, and the main transmission route has changed from blood transmission in the 1980s to homo- and/or hetero-sexual transmission in the 2000s. The lack of early viral samples with clinical information made it difficult to investigate the possible virological changes over time. In this study, we sequenced 142 full-length env genes collected from 16 Japanese subjects infected with HIV-1 in the 1980s and in the 2000s. We examined the diversity change in sequences and potential adaptive evolution of the virus to the host population. We used a codon-based likelihood method under the branch-site and clade models to detect positive selection operating on the virus. The clade model was extended to account for different positive selection pressures in different viral populations. The result showed that the selection pressure was weaker in the 2000s than in the 1980s, indicating that it might have become easier for the HIV to infect a new host and to develop into AIDS now than 20 years ago and that the HIV may be becoming more virulent in the Japanese population. The study provides useful information on the surveillance of HIV infection and highlights the utility of the extended clade models in analysis of virus populations which may be under different selection pressures.
    PLoS ONE 04/2011; 6(4):e18630. DOI:10.1371/journal.pone.0018630 · 3.23 Impact Factor