Evolutionary and structural feedback on selection of sequences for comparative analysis of proteins

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
Proteins Structure Function and Bioinformatics (Impact Factor: 2.63). 04/2006; 63(1):87-99. DOI: 10.1002/prot.20866
Source: PubMed


It has been noted that slowly evolving protein residues have two properties: (a) they tend to cluster in the native fold, and (b) they delineate functional surfaces, that is, the parts of the surface through which the protein interacts with other proteins or small ligands. Herein, we demonstrate that the two are coupled sufficiently strongly that one effect, when observed, statistically implies the other. Both can be detected by methods based on multiple sequence alignments through careful selection of the relevant sequences. For the demonstration, we use two sets of protein families: a small set of diverse proteins with diverse functional surfaces, and a large set of homodimerizing enzymes. A practical outcome of our considerations is a simple prescriptive rule for the selection of homologous sequences for the comparative analysis of proteins: in order to optimize the detection of (potentially unknown) functional surfaces, it is sufficient to select sequences in such a way that the residues observed at any level of evolutionary divergence, as implied by the alignment, cluster on the folded protein.
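To make the selection rule in the abstract concrete, the following is a minimal sketch of how clustering of a residue set on a structure could be scored. It is not the measure used in the paper: the contact cutoff of 8 Å, the random-subset null model, and the names `contact_count`, `clustering_z_score`, and `clustering_profile` are all assumptions made for illustration. The input is assumed to be one coordinate per residue (for example, C-alpha positions) and a ranking of residues from most to least conserved.

```python
import math
import random


def contact_count(coords, selected, cutoff=8.0):
    """Count pairs of selected residues lying within `cutoff` of each other.

    coords   -- list of (x, y, z) tuples, one per residue (assumed input format)
    selected -- iterable of residue indices into coords
    cutoff   -- contact distance; 8.0 is an illustrative choice
    """
    sel = sorted(selected)
    count = 0
    for a in range(len(sel)):
        for b in range(a + 1, len(sel)):
            if math.dist(coords[sel[a]], coords[sel[b]]) <= cutoff:
                count += 1
    return count


def clustering_z_score(coords, selected, cutoff=8.0, trials=1000, seed=0):
    """Compare the observed contact count with random subsets of equal size.

    Returns (observed - mean) / std over `trials` random selections,
    or 0.0 if the random selections show no variation.
    """
    rng = random.Random(seed)
    selected = list(selected)
    observed = contact_count(coords, selected, cutoff)
    indices = range(len(coords))
    samples = [
        contact_count(coords, rng.sample(indices, len(selected)), cutoff)
        for _ in range(trials)
    ]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    if var == 0:
        return 0.0
    return (observed - mean) / math.sqrt(var)


def clustering_profile(coords, ranked, fractions=(0.1, 0.2, 0.3), **kwargs):
    """Score clustering of the top-ranked residues at several coverage levels.

    ranked -- residue indices ordered from most to least conserved
    """
    profile = []
    for f in fractions:
        k = max(2, round(f * len(ranked)))  # need at least two residues to form a pair
        profile.append(clustering_z_score(coords, ranked[:k], **kwargs))
    return profile
```

Under these assumptions, a candidate set of homologous sequences would be preferred when the ranking it induces gives high scores across all coverage levels in `clustering_profile`, rather than at a single threshold.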

    • "... functional sites (Mihalek et al., 2006a, b; Wilkins et al., 2010). In that light, the clustering of these residues simply reflects the fundamental epistatic coupling of neighbors."
    ABSTRACT: The constraints under which sequence, structure, and function co-evolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace (piET) yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the co-evolution of sequence, structure, and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA, and refining protein function prediction.
    Bioinformatics 09/2013; 29(21). DOI:10.1093/bioinformatics/btt489 · 4.98 Impact Factor
    • "Fold recognition programs may not provide an answer to the problem, since the functional similarity to the structure-known protein is not considered in these programs. Considering the functional differences by an ETA-based strategy [82], residue clustering measures [85, 86] or DSPAC [84] can be used to solve the problem, since these can be extended to identify the set of sequences that shares the same or similar biochemical functions. "
    ABSTRACT: Structural genomics projects have solved many new structures with unknown functions. One strategy to investigate the function of a structure is to computationally find the functionally important residues or regions on it. Therefore, the development of functional region prediction methods has become an important research subject. An effective approach is to use a method employing structural and evolutionary information, such as the evolutionary trace (ET) method. ET ranks the residues of a protein structure by calculating the scores for relative evolutionary importance, and locates functionally important sites by identifying spatial clusters of highly ranked residues. After ET was developed, numerous ET-like methods were subsequently reported, and many of them are in practical use, although they require certain conditions. In this mini review, we first introduce the remaining problems and the recent improvements in the methods using structural and evolutionary information. We then summarize the recent developments of the methods. Finally, we conclude by describing possible extensions of the evolution- and structure-based methods.
    Computational and Structural Biotechnology Journal 08/2013; 8(11):e201308007. DOI:10.5936/csbj.201308007
    • "Mihalek et al. [11] also proposed ‘residue clustering measure’ to indicate the appropriateness of the homologous sequences for functional region prediction. This measure essentially quantifies the degree of clustering of the evolutionarily important residues in the tertiary structure of the protein. "
    ABSTRACT: Due to the advent of high-throughput sequencing techniques and structural genomics projects, the number of gene and protein sequences has been ever increasing. Computational methods to annotate these genes and proteins are even more indispensable. Proteins are important macromolecules, and the study of protein function is an important problem in structural bioinformatics. This paper discusses a number of methods to predict protein functional sites, focusing especially on protein-ligand binding site prediction. Initially, a short overview is presented on recent advances in methods for selection of homologous sequences. Furthermore, a few recent structure-based approaches and sequence-and-structure-based approaches for predicting protein functional sites are discussed in detail.
    Computational and Structural Biotechnology Journal 08/2013; 8(11):e201308005. DOI:10.5936/csbj.201308005
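
The citing abstracts above describe ET-like methods as ranking residues by relative evolutionary importance before looking for spatial clusters of the top-ranked ones. As a rough illustration of the ranking step only, the sketch below orders alignment columns by Shannon entropy. This is a simple stand-in and not the evolutionary trace method itself, which also uses the phylogenetic tree; the function names and the toy alignment format (equal-length strings, gaps treated as an ordinary symbol) are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_positions(alignment):
    """Return column indices ordered from most to least conserved.

    alignment -- list of equal-length aligned sequences (strings).
    Gap characters are counted like any other symbol, which is a
    simplification made for this sketch.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    entropies = [
        column_entropy([seq[i] for seq in alignment]) for i in range(length)
    ]
    # Lower entropy means less variation, i.e. a more conserved column.
    return sorted(range(length), key=lambda i: entropies[i])
```

The resulting order could then be passed to a structural clustering score to check whether the most conserved positions lie close together in the fold.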