Identification and analysis of evolutionarily cohesive functional modules in protein networks

The European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany.
Genome Research (Impact Factor: 14.63). 04/2006; 16(3):374-82. DOI: 10.1101/gr.4336406
Source: PubMed


The increasing number of sequenced genomes makes it possible to infer the evolutionary history of functional modules, i.e., groups of proteins that contribute jointly to the same cellular function in a given species. Here we identify prokaryotic functional modules whose composition remains largely unchanged during evolution and analyze their properties. Such "cohesive" modules have a large number of internal functional connections, are encoded by genes that tend to lie in close proximity in prokaryotic genomes, and correspond to physical complexes or complex functional systems like the flagellar apparatus. Cohesive modules are enriched in processes such as energy and amino acid metabolism, cell motility, and intracellular trafficking or secretion. By grouping genes into modules, we achieve a more precise estimate of their age and find that young modules are often horizontally transferred between species and are enriched in functions involved in interactions with the environment, implying that they play an important role in the adaptation of species to new environments.

  • Source
    • "To illustrate this point with an example, we contrast the phylogenetic profiles of members of the EGFR signaling cascade (Figure 4B, top) with six enzymes involved in heme biosynthesis (Figure 4B, bottom). Similar trends have also been highlighted by bacterial phylogenetic profiling studies (Campillos et al., 2006), strongly suggesting the existence of generalizable constraints. Biological networks evolve through the gain and loss of nodes (gene duplication and loss) and the gain, loss, and exchange of edges (new, lost, or rewired functional links between proteins). "
    ABSTRACT: Information about functional connections between genes can be derived from patterns of coupled loss of their homologs across multiple species. This comparative approach, termed phylogenetic profiling, has been successfully used to infer genetic interactions in bacteria and eukaryotes. Rapid progress in sequencing eukaryotic species has enabled the recent phylogenetic profiling of the human genome, resulting in systematic functional predictions for uncharacterized human genes. Importantly, groups of co-evolving genes reveal widespread modularity in the underlying genetic network, facilitating experimental analyses in human cells as well as comparative studies of conserved functional modules across species. This strategy is particularly successful in identifying novel metabolic proteins and components of multi-protein complexes. The targeted sequencing of additional key eukaryotes and the incorporation of improved methods to generate and compare phylogenetic profiles will further boost the predictive power and utility of this evolutionary approach to the functional analysis of gene interaction networks.
  • Source
    • "We also confirmed that orthogroup-based profiling increases predictive power over a pairwise BBH approach restricted to genes not assigned to gene families (Figure S3A). Figure 3C plots the cumulative fraction (upper panel) and number of predicted interactions (lower panel) as a function of PCS at mp = 0.6, after filters were applied to exclude orthogroups that either appeared too recently or contained too many genes to produce useful functional predictions (Figure S3B; Supplemental Experimental Procedures). We found that the strongest co-evolving pairs (2,101 unique genes, PCS ≥ 10) were strongly enriched for large protein complexes, metabolic pathways, and some organelles but devoid of genes involved in canonical signaling, immune responses and transcriptional control (Reactome pathways; see Figure 3D and Table S1), closely mirroring trends in bacteria (Campillos et al., 2006). This observation argues for a generalizable tendency for cellular networks with strong internal coupling (protein complexes and metabolic pathways) to form evolutionary modules over interlinked signaling and transcriptional pathways."
    ABSTRACT: Functional links between genes can be predicted using phylogenetic profiling, by correlating the appearance and loss of homologs in subsets of species. However, effective genome-wide phylogenetic profiling has been hindered by the large fraction of human genes related to each other through historical duplication events. Here, we overcame this challenge by automatically profiling over 30,000 groups of homologous human genes (orthogroups) representing the entire protein-coding genome across 177 eukaryotic species (hOP profiles). By generating a full pairwise orthogroup phylogenetic co-occurrence matrix, we derive unbiased genome-wide predictions of functional modules (hOP modules). Our approach predicts functions for hundreds of poorly characterized genes. The results suggest evolutionary constraints that lead components of protein complexes and metabolic pathways to co-evolve while genes in signaling and transcriptional networks do not. As a proof of principle, we validated two subsets of candidates experimentally for their predicted link to the actin-nucleating WASH complex and cilia/basal body function. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
    Cell Reports 02/2015; 10(6). DOI:10.1016/j.celrep.2015.01.025 · 8.36 Impact Factor
  • Source
    • "In the case of prokaryotic genomes, coevolutionary interactions between genes can be inferred from phyletic patterns by searching for co-occurrence of gene gain (resulting from horizontal gene transfer) and loss events. Several evolutionary methods to infer coevolutionary interactions from phyletic patterns exist, ranging from maximum-parsimony methods (29,30) to methods that provide explicit models of coevolution (31). Recently, we developed a probabilistic method to infer coevolutionary interactions from phyletic patterns (32). "
    ABSTRACT: Evolutionary analysis of phyletic patterns (phylogenetic profiles) is widely used in biology, representing presence or absence of characters such as genes, restriction sites, introns, indels and methylation sites. The phyletic pattern observed in extant genomes is the result of ancestral gain and loss events along the phylogenetic tree. Here we present CoPAP (coevolution of presence-absence patterns), a user-friendly web server, which performs accurate inference of coevolving characters as manifested by co-occurring gains and losses. CoPAP uses state-of-the-art probabilistic methodologies to infer coevolution and allows for advanced network analysis and visualization. We developed a platform for comparing different algorithms that detect coevolution, which includes simulated data with pairs of coevolving sites and independent sites. Using these simulated data we demonstrate that CoPAP performance is higher than alternative methods. We exemplify CoPAP utility by analyzing coevolution among thousands of bacterial genes across 681 genomes. Clusters of coevolving genes that were detected using our method largely coincide with known biosynthesis pathways and cellular modules, thus exhibiting the capability of CoPAP to infer biologically meaningful interactions. CoPAP is freely available for use at
    Nucleic Acids Research 06/2013; 41(Web Server issue). DOI:10.1093/nar/gkt471 · 9.11 Impact Factor
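
The abstracts above all rest on the same idea: genes whose presence and absence across species rise and fall together are candidates for a shared functional module. A minimal sketch of that idea follows. It is not the method of any paper listed here: the Jaccard index is swapped in as a simple stand-in for the scoring schemes those papers use, and the gene names, profiles, and threshold are hypothetical toy values.

```python
from itertools import combinations


def jaccard(p, q):
    """Jaccard similarity between two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def co_occurring_pairs(profiles, threshold=0.8):
    """Return (gene1, gene2, score) for gene pairs scoring at or above threshold.

    `profiles` maps a gene name to a list of 0/1 flags, one per species,
    with every list using the same species order.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            pairs.append((g1, g2, score))
    return pairs


# Hypothetical toy data: four genes scored across six species.
toy_profiles = {
    "geneA": [1, 1, 0, 1, 0, 1],
    "geneB": [1, 1, 0, 1, 0, 1],
    "geneC": [0, 1, 1, 0, 1, 0],
    "geneD": [1, 1, 0, 1, 0, 0],
}

if __name__ == "__main__":
    for g1, g2, score in co_occurring_pairs(toy_profiles, threshold=0.7):
        print(f"{g1}\t{g2}\t{score:.2f}")
```

A plain co-occurrence score like this ignores the species phylogeny, so closely related genomes are counted as independent observations. The tree-aware approaches mentioned in the excerpts (parsimony-based or probabilistic inference of gain and loss events) address that limitation.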