The Effect of Insertions and Deletions on Wirings in Protein-Protein Interaction Networks: A Large-Scale Study

School of Computing Science, Simon Fraser University, Burnaby, Canada.
Journal of computational biology: a journal of computational molecular cell biology (Impact Factor: 1.74). 03/2009; 16(2):159-67. DOI: 10.1089/cmb.2008.03TT
Source: PubMed


Although insertions and deletions (indels) are a common type of sequence variation, their origin and functional consequences are not yet fully understood. Indels are known to occur preferentially in the loop regions of the affected proteins. Moreover, it has recently been demonstrated that indels are significantly more strongly correlated with functional changes than substitutions. In sum, there is substantial evidence that indels, not substitutions, are the predominant evolutionary factor when it comes to structural changes in proteins. It is therefore natural to hypothesize that sizable indels can modify protein interaction interfaces, causing a gain or loss of protein-protein interactions and thereby significantly rewiring the interaction networks. In this paper, we analyze this relationship in a large-scale study. We computed all paralogous protein pairs in Saccharomyces cerevisiae (yeast) and Drosophila melanogaster (fruit fly) and sorted the respective alignments according to whether they contained indels of significant length, as determined by the pair Hidden Markov Model (HMM)-based framework of a recent study. We then computed well-known centrality measures for proteins that participated in indel alignments (indel proteins) and for those that did not. We found that indel proteins indeed showed greater variation in these measures. This demonstrates that indels significantly influence the evolutionary rewiring of interaction networks, which confirms our hypothesis. In general, this study may yield relevant insights into the functional interplay of proteins and the evolutionary dynamics behind it.
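The comparison described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the toy network, the protein names, the indel/non-indel labels, and the choice to quantify "variation" as the mean absolute centrality difference within a paralogous pair are all assumptions made for illustration. Degree and closeness centrality stand in for the "well-known centrality measures", which the abstract does not enumerate.

```python
from collections import deque
from statistics import mean


def build_graph(edges):
    """Build an undirected adjacency map from a list of interaction pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node interacts with."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the summed BFS distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def mean_pair_difference(pairs, centrality):
    """Mean absolute centrality difference between the two paralogs of a pair."""
    return mean(abs(centrality[a] - centrality[b]) for a, b in pairs)


# Hypothetical toy interaction network (not data from the study).
edges = [
    ("A1", "H"), ("A1", "X"), ("A1", "Y"), ("A1", "Z"),
    ("A2", "H"),
    ("B1", "H"), ("B1", "X"),
    ("B2", "H"), ("B2", "Y"),
]
# Hypothetical labels: A1/A2 align with a sizable indel, B1/B2 do not.
indel_pairs = [("A1", "A2")]
non_indel_pairs = [("B1", "B2")]

graph = build_graph(edges)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

indel_degree_diff = mean_pair_difference(indel_pairs, degree)
non_indel_degree_diff = mean_pair_difference(non_indel_pairs, degree)
indel_closeness_diff = mean_pair_difference(indel_pairs, closeness)
non_indel_closeness_diff = mean_pair_difference(non_indel_pairs, closeness)

print("degree:", indel_degree_diff, non_indel_degree_diff)
print("closeness:", indel_closeness_diff, non_indel_closeness_diff)
```

In this toy network the indel pair diverges in both measures while the non-indel pair does not. On real data, a comparison of this kind would need a statistical test over many pairs rather than a comparison of two means.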



Available from: Cenk Sahinalp
    • "Indels, especially frame shifting insertions and deletions, are expected to have large effects on protein functions, since they may change the reading frame of a gene thus change amino acids and probably the functions of proteins. It has been shown that indels cause more severe functional changes in proteins than SNPs [8] and also have significant influence on protein-protein interaction interfaces [9]. As revealed by the Human Gene Mutation Database [3], approximately half (57%) of the human (gene sequence level) disease variations are associated with single nucleotide substitutions, and about a quarter (22%) are associated with small indels [3,10]. "
    ABSTRACT: With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. The absence of understanding is largely due to the lack of both biological data and computational resources. This paper presents a new indel functional prediction method HMMvar based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations. This paper proposed a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM based pipeline program implementing the method HMMvar is freely available at
    BMC Bioinformatics 01/2014; 15(1):5. DOI:10.1186/1471-2105-15-5 · 2.58 Impact Factor
    • "8) (Sarma et al. 2008; Muraki et al. 2010). The surface loops in protein sequences play important roles in mediating protein–protein interactions (Akiva et al. 2008; Singh and Gupta 2009; Hormozdiari et al. 2009). Hence, it is likely that the identified CSIs in the BchL, BchN, and BchB proteins are also involved in mediating protein–protein interaction that are specific and essential for different groups of phototrophs. "
    ABSTRACT: The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. 
    Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been laterally transferred between these groups. Other results and observations reported here indicate that the genes for the BchL-N-B proteins in Proteobacteria are derived from the Clade C Cyanobacteria, whereas those in Chlorobi were acquired from Chloroflexus or related bacteria by means of LGTs. Some implications of these observations regarding the origin and spread of photosynthesis are discussed.
    Molecular Biology and Evolution 05/2012; 29(11):3397-412. DOI:10.1093/molbev/mss145 · 9.11 Impact Factor
    • "Conversely, genes with higher tolerance for nucleotide changes may also tolerate more indels. Although nucleotide substitution patterns in duplicate genes are well studied (Kellis et al. 2004; Brunet et al. 2006; Steinke et al. 2006), patterns of indel accumulation are poorly characterized, even though indels can lead to structural and functional divergence of homologous proteins (Reeves et al. 2006; Chan et al. 2007; Jiang and Blouin 2007; Chen et al. 2009) and thus play an important role in protein evolution (Grishin 2001; Hormozdiari et al. 2009). For example, Zhang, Wang, et al. (2010) showed that indels, as well as substitutions, are necessary to explain protein structure changes in homologous protein families. "
    ABSTRACT: Insertions and deletions (indels) in protein-coding genes are important sources of genetic variation. Their role in creating new proteins may be especially important after gene duplication. However, little is known about how indels affect the divergence of duplicate genes. We here study thousands of duplicate genes in five fish (teleost) species with completely sequenced genomes. The ancestor of these species has been subject to a fish-specific genome duplication (FSGD) event that occurred approximately 350 Ma. We find that duplicate genes contain at least 25% more indels than single-copy genes. These indels accumulated preferentially in the first 40 my after the FSGD. A lack of widespread asymmetric indel accumulation indicates that both members of a duplicate gene pair typically experience relaxed selection. Strikingly, we observe a 30-80% excess of deletions over insertions that is consistent for indels of various lengths and across the five genomes. We also find that indels preferentially accumulate inside loop regions of protein secondary structure and in regions where amino acids are exposed to solvent. We show that duplicate genes with high indel density also show high DNA sequence divergence. Indel density, but not amino acid divergence, can explain a large proportion of the tertiary structure divergence between proteins encoded by duplicate genes. Our observations are consistent across all five fish species. Taken together, they suggest a general pattern of duplicate gene evolution in which indels are important driving forces of evolutionary change.
    Molecular Biology and Evolution 04/2012; 29(10):3005-22. DOI:10.1093/molbev/mss108 · 9.11 Impact Factor