About
91
Publications
13,710
Reads
6,225
Citations
Publications (91)
Background
Sexually transmitted infections spread across contact networks. Partner elicitation and notification are commonly used public health tools to identify, notify, and offer testing to persons linked in these contact networks. For HIV-1, a rapidly evolving pathogen with low per-contact transmission rates, viral genetic sequences are an addit...
Logistic regression analysis of index case being genetically-linked to at least one of their named partners for index cases who named only 1 partner.
(DOCX)
Univariate logistic regression analysis of index case being genetically-linked to at least one of their named partners.
(DOCX)
Multivariate regression analysis of index cases being genetically-linked to their named partners.
(DOCX)
Combined named partner and genetic (≤0.0175 substitutions/site) networks.
Shaded nodes are genetically linked to at least one named partner. Bold edges indicate partner naming that is supported by genetic distance. Edges with arrows indicate direction of partner naming. Edges without arrows are links supported only by genetic distance.
(EPS)
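The edge types described in this caption can be illustrated with a minimal sketch. It assumes pairwise genetic distances are already available as a dictionary keyed by unordered pairs of case identifiers; the function names, the identifiers, and the example distances are all hypothetical and are not taken from the study. Only the 0.0175 substitutions/site threshold comes from the caption.

```python
THRESHOLD = 0.0175  # substitutions/site, as stated in the caption


def genetic_links(distances, threshold=THRESHOLD):
    """Return the unordered pairs whose genetic distance is <= threshold."""
    return {pair for pair, d in distances.items() if d <= threshold}


def classify_edges(named, distances, threshold=THRESHOLD):
    """Split edges into the three kinds drawn in the combined network.

    named     -- iterable of (index_case, named_partner) tuples (directed)
    distances -- dict mapping frozenset({a, b}) to a genetic distance
    """
    linked = genetic_links(distances, threshold)
    named_pairs = {frozenset(edge) for edge in named}
    return {
        "supported": named_pairs & linked,     # naming backed by genetic distance
        "named_only": named_pairs - linked,    # naming without genetic support
        "genetic_only": linked - named_pairs,  # genetic link, no naming
    }


def shaded_nodes(edges):
    """Nodes genetically linked to at least one of their named partners."""
    return set().union(*edges["supported"]) if edges["supported"] else set()


# Hypothetical example data (illustrative values only).
example_distances = {
    frozenset({"A", "B"}): 0.0100,
    frozenset({"A", "C"}): 0.0300,
    frozenset({"B", "D"}): 0.0175,
}
example_named = [("A", "B"), ("A", "C")]
example_edges = classify_edges(example_named, example_distances)
```

In this sketch the threshold is inclusive, matching the "≤" in the caption; direction of naming is dropped when comparing against the undirected genetic links.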
Concordance between named partner and genetic (≤0.015 substitutions/site) networks.
(A) Genetic data mapped onto named partner network. Edges indicate partner naming. (B) Partner naming data mapped onto genetic network. Edges indicate genetic linkage.
(EPS)
Parameter estimates for the mixture distribution of Gamma and Gaussian (normal) distributions.
(DOCX)
Objectives:
Curative strategies using agents to perturb the HIV reservoir have demonstrated only modest activity, whereas increases in viremia after standard vaccination have been described. We investigated whether vaccination against non-HIV pathogens can induce HIV transcription and thereby play a role in future eradication strategies.
Design:...
To design effective eradication strategies, it may be necessary to target HIV reservoirs in anatomic compartments other than blood. This study examined HIV RNA rebound following interruption of antiretroviral therapy (ART) in blood and cerebrospinal fluid (CSF) to determine whether the central nervous system (CNS) might serve as an independent sour...
Background:
Because recently infected individuals disproportionately contribute to the spread of human immunodeficiency virus (HIV), we evaluated the impact of a primary HIV screening program (the Early Test) implemented in San Diego.
Methods:
The Early Test program used combined nucleic acid and serology testing to screen for primary infection...
Background:
Measurement of HIV DNA-bearing cells in cerebrospinal fluid (CSF) is challenging because few cells are present. We present a novel application of the sensitive droplet digital (dd)PCR in this context.
Methods:
We analyzed CSF cell pellets and paired peripheral blood mononuclear cells (PBMC) from 28 subjects, 19 of whom had undetectab...
Over the past two decades, comparative sequence analysis using codon-substitution models has been honed into a powerful and popular approach for detecting signatures of natural selection from molecular data. A substantial body of work has focused on developing a class of "branch-site" models which permit selective pressures on sequences, quantified...
We present BUSTED, a new approach to identifying gene-wide evidence of episodic positive selection, where the non-synonymous substitution rate is transiently greater than the synonymous rate. BUSTED can be used either on an entire phylogeny (without requiring an a priori hypothesis regarding which branches are under positive selection) or on a pre-...
Relaxation of selective strength, manifested as a reduction in the efficiency or intensity of natural selection, can drive evolutionary innovation and presage lineage extinction or loss of function. Mechanisms through which selection can be relaxed range from the removal of an existing selective constraint to a reduction in effective population siz...
Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals - a cure and a vaccine - remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleot...
Herpesviruses have been infecting and co-diverging with their vertebrate hosts for hundreds of millions of years. The primate simplex viruses exemplify this pattern of virus-host co-divergence, at a minimum, as far back as the most recent common ancestor of New World monkeys, Old World monkeys, and apes. Humans are the only primate species known to...
Ecological interaction networks, such as those describing the mutualistic interactions between plants and their pollinators or between plants and their frugivores, exhibit non-random structural properties that cannot be explained by simple models of network formation. One factor affecting the formation and eventual structure of such a network is i...
Evolutionary models that make use of site-specific parameters have recently been criticized on the grounds that parameter estimates obtained under such models can be unreliable and lack theoretical guarantees of convergence. We present a simulation study providing empirical evidence that a simple version of the models in question does exhibit sensi...
Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchica...
Positively selected sites in Drosophila adh found by MEME at . The FEL result column summarizes the classification obtained by FEL. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in HIV-1 viral infectivity factor (vif). stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in Japanese encephalitis virus env. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in mammalian β-globin. The FEL result column summarizes the classification obtained by FEL. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Comparative performance of FEL and MEME on simulated data where does not vary among tree branches. The rate of false positives (FP) and power are reported for a fixed nominal test p-value of . Power is also shown for the p-value that achieves FP of 0.05, estimated empirically from the distribution of p-values on the subset of sites evolving neutral...
Positively selected sites in Hepatitis D virus Ag. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in abalone sperm lysin. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in HIV-1 reverse transcriptase (rt). stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significant sites (FEL ).
(PDF)
Positively selected sites in Influenza A virus hemagglutinin (H3N2 serotype). Superscript letters after the site indicate the epitope in which substitutions can affect phenotype. stands for a positively selected site and stands for a negatively selected site (FEL ). and reflect borderline significant sites (FEL p between and ). and denote significa...
Author Summary
Identifying regions of protein coding genes that have undergone adaptive evolution is important to answering many questions in evolutionary biology and genetics. In order to tease out genetic evidence for natural selection, genes from a diverse array of taxa must be analyzed, only a subset of which may have undergone adaptive evoluti...
A balanced phylogeny used for simulations. Foreground branches are marked in red. See Text S1 for further simulation details.
(PDF)
False positive rates for data sets simulated under strict neutrality using empirical trees from TreeBase. The entries are sorted in order of increasing mean false positive rate derived from simulated data (10 replicates per tree). Mean divergence between any pair of leaves in a given tree is reported in expected nucleotide substitutions per site. F...
Integrase results - DEPS.
(PDF)
Protease - MEDS: Maximum likelihood parameter values for the test for episodic directional selection.
(PDF)
Reverse Transcriptase - MEDS: Maximum likelihood parameter values for the test for episodic directional selection.
(PDF)
The maximum-likelihood phylogeny for the reverse transcriptase dataset. Foreground branches are marked in red. All terminal foreground branches lead to sequences obtained from patients who had been receiving antiretroviral therapy.
(PDF)
Reverse Transcriptase - FEEDS: Maximum likelihood parameter values for the test for episodic diversifying selection.
(PDF)
Protease - FEEDS: Maximum likelihood parameter values for the test for episodic diversifying selection.
(PDF)
Integrase - MEDS: Maximum likelihood parameter values for the test for episodic directional selection.
(PDF)
The maximum-likelihood phylogeny for the integrase dataset. Foreground branches are marked in red. All terminal foreground branches lead to sequences obtained from patients who had been receiving antiretroviral therapy.
(PDF)
Protease results - DEPS.
(PDF)
Integrase - FEEDS: Maximum likelihood parameter values for the test for episodic diversifying selection.
(PDF)
Simulation details. The variation in nuisance parameters used for our simulations.
(PDF)
Author Summary
When exposed to treatment, HIV-1 and other rapidly evolving viruses have the capacity to acquire drug resistance mutations (DRAMs), which limit the efficacy of antivirals. There are a number of experimentally well characterized HIV-1 DRAMs, but many mutations whose roles are not fully understood have also been reported. In this manus...
Standard genotypic antiretroviral resistance testing, performed by bulk sequencing, does not readily detect variants that comprise <20% of the circulating HIV-1 RNA population. Nevertheless, it is valuable in selecting an antiretroviral regimen after antiretroviral failure. In patients with poor adherence, resistant variants may not reach this thre...
The 454 sequencing platform enables new investigatory methods for characterizing the diversity of a population, but is prone to sequencing errors such as homopolymer length miscalls which can lead to frame shift errors in analyses of protein coding sequences. We present an analysis methodology intended to correct such frame shift errors, distinguis...
Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are availab...
Adaptive evolution frequently occurs in episodic bursts, localized to a few sites in a gene, and to a small number of lineages in a phylogenetic tree. A popular class of "branch-site" evolutionary models provides a statistical framework to search for evidence of such episodic selection. For computational tractability, current branch-site models unr...
Randomly selected Pandit data model comparisons using BIC. In each case we fitted the ECM, LCAP and GAs models to each of four randomly selected Pandit datasets. Model ranks (BIC/difference in BIC score relative to the best model) are shown.
(0.03 MB PDF)
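As a reading aid for the rank/difference format used in this caption, the following is a minimal sketch of ranking fitted models by BIC. It uses the standard definition BIC = k ln(n) − 2 ln(L); the choice of n (for example, the number of alignment sites) and the helper names are assumptions, and no values from the paper are reproduced here.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def rank_by_bic(fits, n_obs):
    """Rank models by BIC.

    fits  -- dict mapping model name to (log_likelihood, n_params)
    n_obs -- sample size used in the penalty term (an assumption here)

    Returns a list of (rank, name, delta), where delta is the BIC
    difference relative to the best (lowest-BIC) model.
    """
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    ordered = sorted(scores, key=scores.get)
    return [(rank, name, scores[name] - best)
            for rank, name in enumerate(ordered, start=1)]


# Hypothetical fits with made-up log-likelihoods and parameter counts.
example_fits = {"model_1": (-1000.0, 10), "model_2": (-990.0, 30)}
example_ranking = rank_by_bic(example_fits, n_obs=500)
```

Here the richer model has the higher likelihood but is penalized for its extra parameters, so the simpler model ranks first.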
Difference in mean (standard deviation) model mBIC scores for multi-taxon simulations. D is the average pairwise divergence; mBICn is the difference in model mBIC score between the model with n−1 rates and a more complex model with n rates; P is the proportion of correctly identified models for 100 simulations. Positive mBIC scores indicate preference fo...
Qualitative comparison of structured GA models.
(0.02 MB PDF)
Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substit...
Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of histori...
The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide acceptance, we argue that the assumption of a sin...
Existing methods for the prediction of immunologically active T-cell epitopes are based on the amino acid sequence or structure of pathogen proteins. Additional information regarding the locations of epitopes may be acquired by considering the evolution of viruses in hosts with different immune backgrounds. In particular, immune-dependent evolution...
The cytotoxic T-lymphocyte immune response is important in controlling HIV-1 replication in infected humans. In this immune pathway, viral peptides within infected cells are presented to T-lymphocytes by the polymorphic human leukocyte antigens (HLA). HLA alleles exert selective pressure on the peptide regions and immune escape mutations that occur...
Over time, natural selection molds every gene into a unique mosaic of sites evolving rapidly or resisting change—an “evolutionary fingerprint” of the gene. Aspects of this evolutionary fingerprint, such as the site-specific ratio of nonsynonymous to synonymous substitution rates (dN/dS), are commonly used to identify genetic features of potential b...
Host immune responses against infectious pathogens exert strong selective pressures favouring the emergence of escape mutations that prevent immune recognition. Escape mutations within or flanking functionally conserved epitopes can occur at a significant cost to the pathogen in terms of its ability to replicate effectively. Such mutations come und...
Positive selection pressure acting on protein-coding sequences is usually inferred when the rate of nonsynonymous substitution is greater than the synonymous rate. However, purifying selection acting directly on the nucleotide sequence can lower the synonymous substitution rate. This could result in false inference of positive selection because whe...
Evidence of overlap between high omega at a codon and low dS at the synonymous sites. Positively selected sites at which a significantly low dS was observed at the synonymous sites. Positively selected sites are shown in blue vertical lines and sites with low dS are shaded in light blue.
Highly conserved regions observed at the subtype-level. dS across subtypes B and C gag and env genes showing more conserved sites at the subtype sequence level within the INS regions in gag and RRE in env.
G-A mutations in a variable region in the nef gene. Mutations observed in reference sequences in comparison to Group M ancestral sequence identified using the hypermut tool available in the Los Alamos database. The highly variable region (labeled "G-A" in Figure 3d) showed G-A mutations and is boxed in red.
Functional analysis of a novel region in the env gene. (a) Production of p24 from transfected wildtype p81 and 3 mutants produced from synonymous mutations introduced in the previously uncharacterized conserved region in env. (b) Comparison of infectivity between wildtype and the mutants.
HLA epitope maps of sites under positive selection. Positively selected sites identified using a standard diversifying selection model (D) or the toggling model (T) in (A) nef, (B) env, (C) gag, (D) pol. Sites unique to each model are shown as open circles, whereas shared sites are indicated with triangles. Optimal CTL epitopes (http://www.hiv.lanl...
Effect of tree shape on power to detect positive selection. Simulated data was used to construct ROC plots of the effects of tree shape on performance of both models. (A) Diversifying selection. (B) Positive selection (diversifying selection and toggling). (C) Amino acid toggling only.
(0.24 MB PDF)
Mapping of wild type and escape mutations to phylogeny. Mapping of codon states to terminal branches for nef site 83. Branches are colored according to codon category, c (Figure 1). Taxon labels are accession_codon. Tree is rooted with subtype B HIV-1 sequence.
(0.24 MB PDF)
Comparison of the area under the ROC curves shown in Figure 3.
(0.03 MB DOC)
Amino acid toggling and tree shape. Toggling was simulated either (A) along a random tree in which branch lengths were drawn from an exponential distribution (mean = 0.05), or (B) along an HIV-1 tree estimated from published nef sequence data [4].
(0.02 MB PDF)
Number of sites under positive selection in real HIV-1 data. Number of positively selected sites detected using a standard diversifying selection model (D) compared to a toggling model (T) for each of four HIV-1 genes; (A) pol, (B) nef, (C) gag, (D) env. Counts indicate numbers of positively selected sites identified with each method, the number of...
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful, because...
Accurate mRNA splicing depends on multiple regulatory signals encoded in the transcribed RNA sequence. Many examples of mutations within human splice regulatory regions that alter splicing qualitatively or quantitatively have been reported and allelic differences in mRNA splicing are likely to be a common and important source of phenotypic diversit...
Length differences and frame preservation of allele-specific isoform pairs. Exon-array diagrams. Diagrams illustrating evidence of allele-specific splicing from exon-array data for 1,185 srSNPs are available from .
Allele-specific splicing candidates inferred from EST data. Exon-array diagrams. Diagrams illustrating evidence of allele-specific splicing from exon-array data for 1,185 srSNPs are available from .
Allele-specific splicing candidates inferred from exon array data. Exon-array diagrams. Diagrams illustrating evidence of allele-specific splicing from exon-array data for 1,185 srSNPs are available from .
Results of genome-wide scan of polymorphisms in splicing-regulatory regions. Exon-array diagrams. Diagrams illustrating evidence of allele-specific splicing from exon-array data for 1,185 srSNPs are available from .
Understanding how pathogens acquire resistance to drugs is important for the design of treatment strategies, particularly for rapidly evolving viruses such as HIV-1. Drug treatment can exert strong selective pressures and sites within targeted genes that confer resistance frequently evolve far more rapidly than the neutral rate. Rapid evolution at...
In a diploid organism the proportion of transcripts that are produced from the two parental alleles can differ substantially due, for example, to epigenetic modification that causes complete or partial silencing of one parental allele or to cis-acting polymorphisms that affect transcriptional regulation. Counts of SNP alleles derived from EST sequen...
Accurate detection of positive Darwinian selection can provide important insights to researchers investigating the evolution of pathogens. However, many pathogens (particularly viruses) undergo frequent recombination and the phylogenetic methods commonly applied to detect positive selection have been shown to give misleading results when applied to...