The effect of insertions and deletions on wirings in protein-protein interaction networks: a large-scale study.

School of Computing Science, Simon Fraser University, Burnaby, Canada.
Journal of computational biology: a journal of computational molecular cell biology (Impact Factor: 1.69). 03/2009; 16(2):159-67. DOI: 10.1089/cmb.2008.03TT
Source: PubMed

ABSTRACT Although insertions and deletions (indels) are a common type of sequence variation, their origin and functional consequences are not yet fully understood. Indels are known to occur preferentially in the loop regions of the affected proteins. Moreover, it has recently been demonstrated that indels are significantly more strongly correlated with functional changes than substitutions are. In sum, there is substantial evidence that indels, not substitutions, are the predominant evolutionary factor in structural changes in proteins. It is therefore natural to hypothesize that sizable indels can modify protein interaction interfaces, causing a gain or loss of protein-protein interactions and thereby significantly rewiring the interaction networks. In this paper, we analyze this relationship in a large-scale study. We computed all paralogous protein pairs in Saccharomyces cerevisiae (yeast) and Drosophila melanogaster (fruit fly), and sorted the respective alignments according to whether they contained indels of significant length, as determined by the pair Hidden Markov Model (HMM)-based framework of a recent study. We subsequently computed well-known centrality measures for proteins that participated in indel alignments (indel proteins) and for those that did not. We found that indel proteins indeed showed greater variation in these measures. This demonstrates that indels have a significant influence on the evolutionary rewiring of interaction networks, which confirms our hypothesis. In general, this study may yield relevant insights into the functional interplay of proteins and the evolutionary dynamics behind it.
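
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the centrality measures or the statistic used to compare them, so the choice of degree centrality and of a mean absolute difference per paralog pair is an assumption made here for illustration, and the network, protein labels, and pair lists are hypothetical toy data rather than the study's data.

```python
from statistics import mean


def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def mean_centrality_gap(pairs, centrality):
    """Mean absolute centrality difference across paralog pairs."""
    return mean(
        abs(centrality.get(a, 0.0) - centrality.get(b, 0.0)) for a, b in pairs
    )


# Hypothetical toy interaction network: a hub (A) plus a triangle (F, G, H).
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("F", "G"), ("G", "H"), ("F", "H"),
]

# Hypothetical paralog pairs, split by whether their alignment contains
# an indel of significant length (assumed to be classified upstream).
indel_pairs = [("A", "B"), ("C", "F")]
non_indel_pairs = [("F", "G"), ("D", "E")]

centrality = degree_centrality(edges)
indel_gap = mean_centrality_gap(indel_pairs, centrality)
non_indel_gap = mean_centrality_gap(non_indel_pairs, centrality)

print(f"indel pairs:     {indel_gap:.3f}")
print(f"non-indel pairs: {non_indel_gap:.3f}")
```

In this toy setup the indel pairs show the larger gap by construction; on real data the two groups would need a proper statistical test, which the sketch does not attempt.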

  • Source
    ABSTRACT: Insertions/deletions (indels) in protein sequences are useful as drug targets, protein structure predictors, species diagnostics and evolutionary markers. However, there is limited understanding of indel evolutionary patterns. We sought to characterize indel patterns, focusing first on the major groups of multicellular eukaryotes. Comparisons of complete proteomes from a taxonomically broad set of primarily Metazoa, Fungi and Viridiplantae yielded 299 substantial (>250aa) universal, single-copy (in-paralog only) proteins, from which 901 simple (present/absent) and 3,806 complex (multistate) indels were extracted. Simple indels are mostly small (1-7aa) with a most frequent size class of 1aa. However, even these simple-looking indels show a surprisingly high level of hidden homoplasy (multiple independent origins). Among the apparently homoplasy-free simple indels, we identify 69 potential clade-defining indels (CDIs) that may warrant closer examination. CDIs show a very uneven taxonomic distribution among Viridiplantae (13 CDIs), Fungi (40 CDIs), and Metazoa (0 CDIs). An examination of singleton indels shows an excess of insertions over deletions in nearly all examined taxa. This excess averages 2.31 overall, with a maximum observed value of 7.5-fold. We find considerable potential for identifying taxon-marker indels using an automated pipeline. However, it appears that simple indels in universal proteins are too rare and homoplasy-rich to be used for pure indel-based phylogeny. The excess of insertions over deletions seen in nearly every genome and major group examined may be useful in defining more realistic gap penalties for sequence alignment. This bias also suggests that insertions in highly conserved proteins experience less purifying selection than do deletions.
    BMC Evolutionary Biology 07/2013; 13(1):140. · 3.41 Impact Factor
  • Source
    ABSTRACT: The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions ('decorations' at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure-function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
    Biochemical Journal 02/2013; 449(3):581-94. · 4.78 Impact Factor
  • Source
    ABSTRACT: With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertions and deletions, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. This gap in understanding is largely due to the lack of both biological data and computational resources. This paper presents a new indel functional prediction method, HMMvar, based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations. This paper proposed a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM-based pipeline program implementing the method HMMvar is freely available at
    BMC Bioinformatics 01/2014; 15(1):5. · 2.67 Impact Factor
