Evolution of the hepatitis E virus hypervariable region

Centre for Immunity, Infection and Evolution, University of Edinburgh, Ashworth Building, King's Buildings, Edinburgh EH9 3JF, UK.
Journal of General Virology (Impact Factor: 3.18). 07/2012; 93(Pt 11):2408-18. DOI: 10.1099/vir.0.045351-0
Source: PubMed


The presence of a hypervariable region (HVR) within the genome of hepatitis E virus (HEV) remains unexplained. Previous studies have described the HVR as a proline-rich spacer between flanking functional domains of the ORF1 polyprotein. Others have proposed that the region has no function, that it reflects a hypermutable region of the virus genome, that it is derived from the insertion and evolution of host sequences or that it is subject to positive selection. This study attempts to differentiate between these explanations by documenting the evolutionary processes occurring within the HVR. We have measured the diversity of HVR sequences within acutely infected individuals or amongst sequences derived from epidemiologically linked samples and, surprisingly, find relative homogeneity amongst these datasets. We found no evidence of positive selection for amino acid substitution in the HVR. Through an analysis of published sequences, we conclude that the range of HVR diversity observed within virus genotypes can be explained by the accumulation of substitutions and, to a much lesser extent, through deletions or duplications of this region. All published HVR amino acid sequences display a relative overabundance of proline and serine residues that cannot be explained by a local bias towards cytosine in this part of the genome. Although all published HVRs contain one or more SH3-binding PxxP motifs, this motif does not occur more frequently than would be expected from the proportion of proline residues in these sequences. Taken together, these observations are consistent with the hypothesis that the HVR has a structural role that is dependent upon length and amino acid composition, rather than a specific sequence.
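The comparison between observed and expected PxxP motif counts can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify; it assumes the simplest null model, in which each residue is proline independently with probability equal to the observed proline fraction. The function names and the toy sequence are hypothetical.

```python
import re


def count_pxxp(seq: str) -> int:
    """Count PxxP motifs in an amino acid sequence, including overlapping ones."""
    # A zero-width lookahead lets overlapping motifs (e.g. in "PPPPP") each be counted.
    return len(re.findall(r"(?=P..P)", seq))


def expected_pxxp(seq: str) -> float:
    """Expected PxxP count under an assumed independence model.

    Assumption (not from the source): each position is proline with
    probability p = observed proline fraction, independently of the others.
    There are len(seq) - 3 windows of length 4, and each is a PxxP motif
    with probability p * p.
    """
    n = len(seq)
    if n < 4:
        return 0.0
    p = seq.count("P") / n
    return (n - 3) * p * p


if __name__ == "__main__":
    toy = "PSAPPTSPLP"  # hypothetical proline-rich fragment, not a real HVR sequence
    print(count_pxxp(toy), expected_pxxp(toy))
```

An observed count close to the expected value, as in this toy case, would be consistent with the motifs arising from proline abundance alone. A shuffling-based null model that preserves the exact composition of each sequence would be a stricter alternative.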



Available from: Donald B Smith
  • Source
    • "Lack of phylogenetic separation by host among HEV3 and HEV4 strains A strong phylogenetic association between HEV strains and host range is clearly established, with the HEV1 and HEV2 strains infecting humans, whereas HEV3 and HEV4 strains infect animals and humans (Purdy and Khudyakov, 2011; Krawczynski et al., 2000). However, no ancestral associations with host specificity were found among the HEV3 or HEV4 strains despite many attempts to identify a phylogenetic linkage of individual strains to the host origin (Bouquet et al., 2012a; Purdy et al., 2012b; Smith et al., 2012). Although the host-specific distribution of HEV3 subtypes was observed in a small rural community in southeastern Fig. 6. "
    ABSTRACT: Hepatitis E virus (HEV) causes epidemic and sporadic cases of hepatitis worldwide. HEV genotypes 3 (HEV3) and 4 (HEV4) infect humans and animals, with swine being the primary reservoir. The relevance of HEV genetic diversity to host adaptation is poorly understood. We employed a Bayesian network (BN) analysis of HEV3 and HEV4 to detect epistatic connectivity among protein sites and its association with the host specificity in each genotype. The data imply coevolution among ∼70% of polymorphic sites from all HEV proteins and association of numerous coevolving sites with adaptation to swine or humans. BN models for individual proteins and domains of the nonstructural polyprotein detected the host origin of HEV strains with accuracy of 74%-93% and 63%-87%, respectively. These findings, taken together with the lack of phylogenetic association to host, suggest that HEV host specificity is a heritable and convergent phenotypic trait achievable through a variety of genetic pathways (abundance), and explain the broad host range of HEV3 and HEV4.
    Full-text · Article · Jun 2014 · Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases
  • Source
    ABSTRACT: The classification of hepatitis E virus (HEV) variants is currently in transition without agreed definitions for genotypes and subtypes, or for deeper taxonomic groupings into species and genera that could incorporate more recently characterized viruses assigned to the Hepeviridae family that infect birds, bats, rodents and fish. These conflicts arise because of differences in the viruses and genomic regions compared, and in the methodology used. We have re-examined published sequences and found that synonymous substitutions were saturated in comparisons between and within virus genotypes. Analysis of complete genome sequences or concatenated ORF1/ORF2 amino acid sequences indicated that HEV variants most closely related to those infecting humans can be consistently divided into six genotypes (types 1-4 and two additional genotypes from wild boar). Variants isolated from rabbits, closely related to genotype 3, occupy an intermediate position. No consistent criteria could be defined for the assignment of virus subtypes. Analysis of amino acid sequences from these viruses with the more divergent variants from chickens, bats, and rodents in three conserved subgenomic regions (residues 1-452 or 974-1534 of ORF1, or residues 105-458 of ORF2) provided consistent support for a division into four groups, corresponding to HEV variants infecting humans and pigs, those infecting rats and ferrets, those from bats, and those from chickens. This approach may form the basis for a future genetic classification of HEV into four species with the more divergent HEV-like virus from fish (cutthroat trout virus) representing a second genus.
    Full-text · Article · Feb 2013 · Journal of Virology
  • Source
    ABSTRACT: Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection - an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: we illustrate this on a large influenza haemagglutinin dataset (3142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution, as well as on the Datamonkey web server.
    Full-text · Article · Feb 2013 · Molecular Biology and Evolution