Protein misinteraction avoidance causes highly expressed proteins to evolve slowly

Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 03/2012; 109(14):E831-40. DOI: 10.1073/pnas.1117408109
Source: PubMed


The tempo and mode of protein evolution have been central questions in biology. Genomic data have shown a strong influence of the expression level of a protein on its rate of sequence evolution (E-R anticorrelation), which is currently explained by the protein misfolding avoidance hypothesis. Here, we show that this hypothesis does not fully explain the E-R anticorrelation, especially for protein surface residues. We propose that natural selection against protein-protein misinteraction, which wastes functional molecules and is potentially toxic, constrains the evolution of surface residues. Because highly expressed proteins are under stronger pressures to avoid misinteraction, surface residues are expected to show an E-R anticorrelation. Our molecular-level evolutionary simulation and yeast genomic analysis confirm multiple predictions of the hypothesis. These findings show a pluralistic origin of the E-R anticorrelation and reveal the role of protein misinteraction, an inherent property of complex cellular systems, in constraining protein evolution.



Available from: Jianrong Yang
  • Source
    • "Based on Correspondence Analysis (CA) results, we observed the universal rule that functional factors (ESS and PPA) and transcriptional abundance (CAI and EL) were roughly grouped together, opposing the ERs in the second principal component (PC2, see Methods) (Additional file 1: Figure S1, Figure 4). Evolutionary constraints on highly transcribed proteins might prevent misfolding [7] or misinteraction [23]. This can hamper functionality and even potentially produce a large quantity of toxic proteins. "
    ABSTRACT: Despite rapid progress in understanding the mechanisms that shape the evolution of proteins, the relative importance of various factors remains to be elucidated. In this study, we have assessed the effects of 16 different biological features on the evolutionary rates (ERs) of protein-coding sequences in bacterial genomes. Our analysis of 18 bacterial species revealed new correlations between ERs and constraining factors. Previous studies have suggested that transcriptional abundance overwhelmingly constrains the evolution of yeast protein sequences. This transcriptional abundance leads to selection against misfolding or misinteractions. In this study we found that no single factor determines the evolution of bacterial proteins. Not only transcriptional abundance (codon adaptation index and expression level), but also protein-protein associations (PPAs), essentiality (ESS), subcellular localization of cytoplasmic membrane (SLM), transmembrane helices (TMH) and hydropathicity score (HS) independently and significantly affected the ERs of bacterial proteins. In some species, PPA and ESS demonstrate higher correlations with ER than transcriptional abundance. Different forces drive the evolution of protein sequences in yeast and bacteria. In bacteria, the constraints are involved in avoiding a build-up of toxic molecules caused by misfolding/misinteraction (transcriptional abundance), while retaining important functions (ESS, PPA) and maintaining the cell membrane (SLM, TMH and HS). Each of these independently contributes to the variation in protein evolution.
    Full-text · Article · Aug 2013 · BMC Evolutionary Biology
  • Source
    • "It is reasonable to assume that proteins having numerous binding partners experience stronger selective constraints than other proteins having fewer binding partners, all other aspects being equal. Therefore it is reasonable to expect a network's hub to evolve more slowly than the nodes of the network [82,83]. As such, we may expect that evolutionary innovations proceed through mutations in the less-connected nodes of the network (e.g. as demonstrated by Jeong et al. [41], that highly connected proteins are more essential for survival than fewer connected proteins). "
    ABSTRACT: The modern synthesis of evolutionary theory and genetics has enabled us to discover underlying molecular mechanisms of organismal evolution. We know that in order to maximize an organism's fitness in a particular environment, individual interactions among components of protein and nucleic acid networks need to be optimized by natural selection, or sometimes through random processes, as the organism responds to changes and/or challenges in the environment. Despite the significant role of molecular networks in determining an organism's adaptation to its environment, we still do not know how such inter- and intra-molecular interactions within networks change over time and contribute to an organism's evolvability while maintaining overall network functions. One way to address this challenge is to identify connections between molecular networks and their host organisms, to manipulate these connections, and then attempt to understand how such perturbations influence molecular dynamics of the network and thus influence evolutionary paths and organismal fitness. In the present review, we discuss how integrating evolutionary history with experimental systems that combine tools drawn from molecular evolution, synthetic biology and biochemistry allows us to identify the underlying mechanisms of organismal evolution, particularly from the perspective of protein interaction networks.
    Full-text · Article · Jul 2013 · Biochemical Journal
  • Source
    • "It is well known that there is a strong negative correlation between the expression level of a protein and its rate of evolution [36,37]. This relationship is currently explained by protein misfolding [37,38] and misinteraction avoidances [39]. Our analyses indicate that ribosomal usage efficiency may also be a relevant factor that determines the evolution of coding sequences in human genes and also possibly in other vertebrates. "
    ABSTRACT: Background: Gene expression is one of the most relevant biological processes of living cells. Due to the relatively small population sizes, it is predicted that human gene sequences are not strongly influenced by selection towards expression efficiency. One of the major problems in estimating to what extent gene characteristics can be selected to maximize expression efficiency is the wide variation that exists in RNA and protein levels among physiological states and different tissues. Analyses of datasets of stably expressed genes (i.e. with consistent expression between physiological states and tissues) would provide more accurate and reliable measurements of associations between variations of a specific gene characteristic and expression, and how distinct gene features work to optimize gene expression. Results: Using a dataset of human genes with consistent expression between physiological states we selected gene sequence signatures related to translation that can predict about 42% of mRNA variation. The prediction can be increased to 51% when selecting genes that are stably expressed in more than 1 tissue. These genes are enriched for translation and ribosome biosynthesis processes and have higher translation efficiency scores, smaller coding sequences and 3′ UTR sizes and lower folding energies when compared to other datasets. Additionally, the amino acid frequencies weighted by expression showed higher correlations with isoacceptor tRNA gene copy number, and smaller absolute correlation values with biosynthetic costs. Conclusion: Our results indicate that human gene sequence characteristics related to transcription and translation processes can co-evolve in an integrated manner in order to optimize gene expression.
    Full-text · Article · Apr 2013 · BMC Genomics