A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences

EMBL Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.
BMC Bioinformatics (Impact Factor: 2.58). 02/2008; 9(1):229. DOI: 10.1186/1471-2105-9-229
Source: PubMed


The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant.
We have developed a method for scoring the conservation of linear motif instances. It requires only information derived from the primary sequence (e.g. a multiple alignment and a sequence tree) and takes into account the degenerate nature of linear motif patterns. In our benchmark, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real-time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface.
The conservation score improves the prediction of linear motifs by discarding matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of proteins, where a domain-masking filtering strategy is not applicable.
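
The general idea of scoring a motif instance by its conservation can be illustrated with a minimal sketch. This is not the published algorithm: it simply computes the weighted fraction of homologous sequences whose aligned segment still matches a degenerate pattern. The function name, the example pattern, the toy alignment, the per-sequence weights (assumed here to have been derived beforehand from a sequence tree) and the cutoff are all hypothetical.

```python
import re


def conservation_score(pattern, alignment, query_id, start, end, weights):
    """Weighted fraction of homologs whose aligned segment matches `pattern`.

    pattern   -- degenerate motif written as a regular expression
    alignment -- dict mapping sequence id to its gapped, aligned sequence
    start,end -- alignment columns [start, end) covered by the query match
    weights   -- dict of per-sequence weights (assumption: tree-derived);
                 sequences without an entry get weight 1.0
    """
    regex = re.compile(pattern)
    total = 0.0
    conserved = 0.0
    for seq_id, aligned in alignment.items():
        if seq_id == query_id:
            continue  # the query instance itself is not counted
        weight = weights.get(seq_id, 1.0)
        # Drop gap characters so the pattern is tested on residues only.
        segment = aligned[start:end].replace("-", "")
        total += weight
        if regex.fullmatch(segment):
            conserved += weight
    return conserved / total if total else 0.0


# Hypothetical degenerate pattern and toy alignment, for illustration only.
PATTERN = r"[ST]P.[KR]"
ALIGNMENT = {
    "query": "AASPEKAA",
    "hom1": "AATPQRAA",  # pattern conserved (T for S, R for K)
    "hom2": "AASAEKAA",  # pattern lost (A replaces P)
    "hom3": "AASPDRGG",  # pattern conserved
}
# Hypothetical weights: close relatives are down-weighted.
WEIGHTS = {"hom1": 1.0, "hom2": 0.5, "hom3": 0.5}

score = conservation_score(PATTERN, ALIGNMENT, "query", 2, 6, WEIGHTS)

# Hypothetical cutoff: matches scoring below it would be discarded.
CUTOFF = 0.5
keep_match = score >= CUTOFF
```

Testing the pattern itself, rather than residue identity, is one simple way to tolerate the degeneracy of motif patterns: a substitution that the pattern allows still counts as conserved.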



Available from: Rodrigo López Serrano
  • "A keyword tree is built by a collection of several short strings [11]"

    Preview · Article · Jan 2013
  • "Differing conservation of PTM types within eukaryotes. We comparatively studied the conservation status of the 13 PTM types as the first step for understanding their functional relations and their co-occurrence within proteins. As experimental data are not yet covering all organisms comprehensively, we assume, as implemented in other algorithms for similar purposes (Chica et al, 2008; Malik et al, 2008; Biswas et al, 2010), that the conservation of the site can be a good approximation for the conservation of the PTM. Indeed, this approach has been used to distinguish between functional and non-functional phosphorylation sites (Gnad et al, 2007; Holt et al, 2009; Tan and Bader 2012) and a less-strict criterion, the overall conservation of the proteins, was applied to determine the age of the PTMs functionality (Choudhary et al, 2009; Zielinska et al, 2010)."
    ABSTRACT: Various post-translational modifications (PTMs) fine-tune the functions of almost all eukaryotic proteins, and co-regulation of different types of PTMs has been shown within and between a number of proteins. Aiming at a more global view of the interplay between PTM types, we collected modifications for 13 frequent PTM types in 8 eukaryotes, compared their speed of evolution and developed a method for measuring PTM co-evolution within proteins based on the co-occurrence of sites across eukaryotes. As many sites are still to be discovered, this is a considerable underestimate, yet, assuming that most co-evolving PTMs are functionally associated, we found that PTM types are vastly interconnected, forming a global network that comprises in human alone >50,000 residues in about 6000 proteins. We predict substantial PTM type interplay in secreted and membrane-associated proteins and in the context of particular protein domains and short-linear motifs. The global network of co-evolving PTM types implies a complex and intertwined post-translational regulation landscape that is likely to regulate multiple functional states of many if not all eukaryotic proteins.
    Full-text · Article · Jul 2012 · Molecular Systems Biology
  • "However, due to the high likelihood of motifs occurring in a stochastic manner, the use of pattern matching alone produces a large number of false positive hits (6). Methods have, therefore, been developed to incorporate additional filters based on the attributes of SLiMs, including sequence conservation (11–13), structural availability (14–16), biophysical feasibility (17) and biological keywords (18). Recently, a number of de novo motif prediction tools have also emerged, capable of predicting new classes of SLiMs (19–22)."
    ABSTRACT: The recent expansion in our knowledge of protein-protein interactions (PPIs) has allowed the annotation and prediction of hundreds of thousands of interactions. However, the function of many of these interactions remains elusive. The interactions of Eukaryotic Linear Motif (iELM) web server provides a resource for predicting the function and positional interface for a subset of interactions mediated by short linear motifs (SLiMs). The iELM prediction algorithm is based on the annotated SLiM classes from the Eukaryotic Linear Motif (ELM) resource and allows users to explore both annotated and user-generated PPI networks for SLiM-mediated interactions. By incorporating the annotated information from the ELM resource, iELM provides functional details of PPIs. This can be used in proteomic analysis, for example, to infer whether an interaction promotes complex formation or degradation. Furthermore, details of the molecular interface of the SLiM-mediated interactions are also predicted. This information is displayed in a fully searchable table, as well as graphically with the modular architecture of the participating proteins extracted from the UniProt and Phospho.ELM resources. A network figure is also presented to aid the interpretation of results. The iELM server supports single protein queries as well as large-scale proteomic submissions and is freely available at
    Full-text · Article · May 2012 · Nucleic Acids Research