Protein misinteraction avoidance causes highly expressed proteins to evolve slowly.

Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China.
Proceedings of the National Academy of Sciences (Impact Factor: 9.81). 03/2012; 109(14):E831-40. DOI: 10.1073/pnas.1117408109
Source: PubMed

ABSTRACT The tempo and mode of protein evolution have been central questions in biology. Genomic data have shown a strong influence of the expression level of a protein on its rate of sequence evolution (E-R anticorrelation), which is currently explained by the protein misfolding avoidance hypothesis. Here, we show that this hypothesis does not fully explain the E-R anticorrelation, especially for protein surface residues. We propose that natural selection against protein-protein misinteraction, which wastes functional molecules and is potentially toxic, constrains the evolution of surface residues. Because highly expressed proteins are under stronger pressures to avoid misinteraction, surface residues are expected to show an E-R anticorrelation. Our molecular-level evolutionary simulation and yeast genomic analysis confirm multiple predictions of the hypothesis. These findings show a pluralistic origin of the E-R anticorrelation and reveal the role of protein misinteraction, an inherent property of complex cellular systems, in constraining protein evolution.
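
As a rough illustration of what an E-R anticorrelation means in practice, the relationship can be summarized as a negative rank correlation between expression level and evolutionary rate across genes. The sketch below computes a Spearman coefficient from scratch; the gene values are made up for illustration only and are not data from the paper, and the function names (`ranks`, `pearson`, `spearman`) are hypothetical helpers rather than anything the authors describe using.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up values for six hypothetical genes (NOT data from the paper).
expression = [5.0, 20.0, 80.0, 300.0, 1200.0, 5000.0]  # arbitrary expression units
rate = [0.30, 0.22, 0.25, 0.12, 0.08, 0.05]            # arbitrary evolutionary-rate units

rho = spearman(expression, rate)
print(f"Spearman rho = {rho:.3f}")
```

A coefficient near -1 on real data would indicate that more highly expressed genes tend to evolve more slowly. A rank-based measure is used here only because expression levels typically span several orders of magnitude; the abstract does not state which statistic the authors used.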

  • ABSTRACT: The extent that both positive and negative selection vary across different portions of plant genomes remains poorly understood. Here, we sequence whole genomes of 13 Capsella grandiflora individuals and quantify the amount of selection across the genome. Using an estimate of the distribution of fitness effects, we show that selection is strong in coding regions, but weak in most noncoding regions, with the exception of 5' and 3' untranslated regions (UTRs). However, estimates of selection on noncoding regions conserved across the Brassicaceae family show strong signals of selection. Additionally, we see reductions in neutral diversity around functional substitutions in both coding and conserved noncoding regions, indicating recent selective sweeps at these sites. Finally, using expression data from leaf tissue we show that genes that are more highly expressed experience stronger negative selection but comparable levels of positive selection to lowly expressed genes. Overall, we observe widespread positive and negative selection in coding and regulatory regions, but our results also suggest that both positive and negative selection on plant noncoding sequence are considerably rarer than in animal genomes.
    PLoS Genetics 09/2014; 10(9):e1004622. · 8.17 Impact Factor
  • ABSTRACT: The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
    Journal of The Royal Society Interface 08/2014; 11(100):20140419. · 3.86 Impact Factor
  • ABSTRACT: There are two strong and equally important predictors of rates of human protein evolution: the amount the gene is expressed and the proportion of exonic sequence devoted to control splicing, mediated largely by selection on exonic splice enhancer (ESE) motifs. Is the same true for non-coding RNAs, known to be under very weak purifying selection? Prior evidence suggests that selection at splice sites in lincRNAs is important. We now report multiple lines of evidence indicating that the great majority of purifying selection operating on lincRNAs in humans is splice-related. Splice related parameters explain much of the between-gene variation in evolutionary rate in humans. Expression rate is not a relevant predictor, although expression breadth is weakly so. In contrast to protein coding RNAs, we observe no relationship between evolutionary rate and lincRNA stability. As in protein coding genes, ESEs are especially abundant near splice junctions and evolve slower than non-ESE sequence equidistant from boundaries. Nearly all constraint in lincRNAs is at exon ends (N.B. the same is not witnessed in Drosophila). While we cannot definitely answer the question as to why splice-related selection is so important, we find no evidence that splicing might enable the nonsense mediated decay pathway to capture transcripts incorrectly processed by ribosomes. We find evidence consistent with the notion that splicing modifies the underlying chromatin through recruitment of splice-coupled chromatin modifiers, such as CHD1, which in turn might modulate neighbour gene activity. We conclude that most selection on human lincRNAs is splice mediated and suggest that the possibility of splice-chromatin coupling is worthy of further scrutiny.
    Molecular Biology and Evolution 08/2014; · 14.31 Impact Factor
