ABSTRACT: Cellular functions are mediated through complex systems of macromolecules and metabolites linked through biochemical and physical interactions, represented in interactome models as 'nodes' and 'edges', respectively. Better understanding of genotype-to-phenotype relationships in human disease will require modeling of how disease-causing mutations affect systems or interactome properties. Here we investigate how perturbations of interactome networks may differ between complete loss of gene products ('node removal') and interaction-specific or edge-specific ('edgetic') alterations. Global computational analyses of approximately 50,000 known causative mutations in human Mendelian disorders revealed clear separations of mutations probably corresponding to those of node removal versus edgetic perturbations. Experimental characterization of mutant alleles in various disorders identified diverse edgetic interaction profiles of mutant proteins, which correlated with distinct structural properties of disease proteins and disease mechanisms. Edgetic perturbations seem to confer distinct functional consequences from node removal because a large fraction of cases in which a single gene is linked to multiple disorders can be modeled by distinguishing edgetic network perturbations. Edgetic network perturbation models might improve both the understanding of dissemination of disease alleles in human populations and the development of molecular therapeutic strategies.
Molecular Systems Biology 01/2009; 5:321. · 11.34 Impact Factor
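The distinction the abstract above draws between 'node removal' and 'edgetic' perturbation can be illustrated on a toy graph. The following is a minimal sketch, not the authors' analysis: the four-protein network, the names `NODES` and `EDGES`, and the helper functions are all hypothetical, and connected components are used only as one simple readout of how the two perturbations differ.

```python
from collections import deque

# Hypothetical toy interactome: B is a hub bound to A, C and D.
NODES = {"A", "B", "C", "D"}
EDGES = {("A", "B"), ("B", "C"), ("B", "D")}


def components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, found = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        found.append(component)
    return found


def remove_node(nodes, edges, node):
    """Model complete loss of a gene product: drop the node and all its edges."""
    return nodes - {node}, {e for e in edges if node not in e}


def remove_edge(nodes, edges, edge):
    """Model an edgetic perturbation: drop one interaction, keep both proteins."""
    return set(nodes), {e for e in edges if set(e) != set(edge)}


if __name__ == "__main__":
    n, e = remove_node(NODES, EDGES, "B")
    print("node removal of B:", components(n, e))
    n, e = remove_edge(NODES, EDGES, ("B", "C"))
    print("edgetic loss of B-C:", components(n, e))
```

In this toy case, removing B disconnects every remaining protein, whereas losing only the B-C interaction isolates C and leaves A, B and D connected.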
ABSTRACT: Understanding how viral infection alters host physiology requires a complete picture of the perturbations that viral proteins cause in the cellular protein interaction network. The VirusMINT database (http://mint.bio.uniroma2.it/virusmint/) aims to collect all protein interactions between viral and human proteins reported in the literature. VirusMINT currently stores over 5000 interactions involving more than 490 unique viral proteins from more than 110 different viral strains. The whole data set can be easily queried through the search pages and the results can be displayed with a graphical viewer. The curation effort has focused on manuscripts reporting interactions between human proteins and proteins encoded by some of the most medically relevant viruses: papilloma viruses, human immunodeficiency virus 1, Epstein-Barr virus, hepatitis B virus, hepatitis C virus, herpes viruses and Simian virus 40.
Nucleic Acids Research 11/2008; 37(Database issue):D669-73. · 8.28 Impact Factor
ABSTRACT: Impaired mitochondrial function has been implicated in the pathogenesis of type 2 diabetes, heart failure, and neurodegeneration as well as during aging. Studies with the PGC-1 transcriptional coactivators have demonstrated that these factors are central components of the regulatory network that controls mitochondrial function in mammalian cells. Here we describe a genome-wide coactivation assay to globally identify transcription factors and cofactors in this pathway. These analyses revealed a molecular signature of the PGC-1alpha transcriptional network and identified BAF60a (SMARCD1) as a molecular link between the SWI/SNF chromatin-remodeling complexes and hepatic lipid metabolism. Adenoviral-mediated expression of BAF60a stimulates fatty acid beta-oxidation in cultured hepatocytes and ameliorates hepatic steatosis in vivo. PGC-1alpha mediates the recruitment of BAF60a to PPARalpha-binding sites, leading to transcriptional activation of peroxisomal and mitochondrial fat-oxidation genes. These results define a role for the SWI/SNF complexes in the regulation of lipid homeostasis.
ABSTRACT: Current yeast interactome network maps contain several hundred molecular complexes with limited and somewhat controversial representation of direct binary interactions. We carried out a comparative quality assessment of current yeast interactome data sets, demonstrating that high-throughput yeast two-hybrid (Y2H) screening provides high-quality binary interaction information. Because a large fraction of the yeast binary interactome remains to be mapped, we developed an empirically controlled mapping framework to produce a "second-generation" high-quality, high-throughput Y2H data set covering approximately 20% of all yeast binary interactions. Both Y2H and affinity purification followed by mass spectrometry (AP/MS) data are of equally high quality but of a fundamentally different and complementary nature, resulting in networks with different topological and biological properties. Compared to co-complex interactome models, this binary map is enriched for transient signaling interactions and intercomplex connections with a highly significant clustering between essential proteins. Rather than correlating with essentiality, protein connectivity correlates with genetic pleiotropy.
ABSTRACT: Many protein-protein interactions are mediated through independently folding modular domains. Proteome-wide efforts to model protein-protein interaction or "interactome" networks have largely ignored this modular organization of proteins. We developed an experimental strategy to efficiently identify interaction domains and generated a domain-based interactome network for proteins involved in C. elegans early-embryonic cell divisions. Minimal interacting regions were identified for over 200 proteins, providing important information on their domain organization. Furthermore, our approach increased the sensitivity of the two-hybrid system, resulting in a more complete interactome network. This interactome modeling strategy revealed insights into C. elegans centrosome function and is applicable to other biological processes in this and other organisms.
ABSTRACT: Mitochondria are complex organelles whose dysfunction underlies a broad spectrum of human diseases. Identifying all of the proteins resident in this organelle and understanding how they integrate into pathways represent major challenges in cell biology. Toward this goal, we performed mass spectrometry, GFP tagging, and machine learning to create a mitochondrial compendium of 1098 genes and their protein expression across 14 mouse tissues. We link poorly characterized proteins in this inventory to known mitochondrial pathways by virtue of shared evolutionary history. Using this approach, we predict 19 proteins to be important for the function of complex I (CI) of the electron transport chain. We validate a subset of these predictions using RNAi, including C8orf38, which we further show harbors an inherited mutation in a lethal, infantile CI deficiency. Our results have important implications for understanding CI function and pathogenesis and, more generally, illustrate how our compendium can serve as a foundation for systematic investigations of mitochondria.
ABSTRACT: Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to delineate a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy indeed leads to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
ABSTRACT: Describing the 'ORFeome' of an organism, including all major isoforms, is essential for a system-level understanding of any species; however, conventional cloning and sequencing approaches are prohibitively costly and labor-intensive. We describe a potentially genome-wide methodology for efficiently capturing new coding isoforms using reverse transcriptase (RT)-PCR recombinational cloning, 'deep-well' pooling and a next-generation sequencing platform. This ORFeome discovery pipeline will be applicable to any eukaryotic species with a sequenced genome.
ABSTRACT: Accurately defining the coding potential of an organism, i.e., all protein-encoding open reading frames (ORFs) or "ORFeome," is a prerequisite to fully understand its biology. ORFeome annotation involves iterative computational predictions from genome sequences combined with experimental verifications. Here we reexamine a set of Saccharomyces cerevisiae "orphan" ORFs recently removed from the original ORFeome annotation due to lack of conservation across evolutionarily related yeast species. We show that many orphan ORFs produce detectable transcripts and/or translated products in various functional genomics and proteomics experiments. By combining a naïve Bayes model that predicts the likelihood of an ORF to encode a functional product with experimental verification of strand-specific transcripts, we argue that orphan ORFs should still remain candidates for functional ORFs. In support of this model, interstrain intraspecies genome sequence variation is lower across orphan ORFs than in intergenic regions, indicating that orphan ORFs endure functional constraints and resist deleterious mutations. We conclude that ORFs should be evaluated based on multiple levels of evidence and not be removed from ORFeome annotation solely based on low sequence conservation in other species. Rather, such ORFs might be important for micro-evolutionary divergence between species.
Genome Research 08/2008; 18(8):1294-303. · 14.40 Impact Factor
ABSTRACT: A proteome-wide mapping of interactions between hepatitis C virus (HCV) and human proteins was performed to provide a comprehensive view of the cellular infection. A total of 314 protein-protein interactions between HCV and human proteins were identified by yeast two-hybrid and 170 by literature mining. Integration of this data set into a reconstructed human interactome showed that cellular proteins interacting with HCV are enriched in highly central and interconnected proteins. A global analysis on the basis of functional annotation highlighted the enrichment of cellular pathways targeted by HCV. A network of proteins associated with frequent clinical disorders of chronically infected patients was constructed by connecting the insulin, Jak/STAT and TGFbeta pathways with cellular proteins targeted by HCV. CORE protein appeared as a major perturber of this network. Focal adhesion was identified as a new function affected by HCV, mainly by NS3 and NS5A proteins.
Molecular Systems Biology 02/2008; 4:230. · 11.34 Impact Factor
ABSTRACT: Many cancer-associated genes remain to be identified to clarify the underlying molecular mechanisms of cancer susceptibility and progression. Better understanding is also required of how mutations in cancer genes affect their products in the context of complex cellular networks. Here we have used a network modeling strategy to identify genes potentially associated with higher risk of breast cancer. Starting with four known genes encoding tumor suppressors of breast cancer, we combined gene expression profiling with functional genomic and proteomic (or 'omic') data from various species to generate a network containing 118 genes linked by 866 potential functional associations. This network shows higher connectivity than expected by chance, suggesting that its components function in biologically related pathways. One of the components of the network is HMMR, encoding a centrosome subunit, for which we demonstrate previously unknown functional associations with the breast cancer-associated gene BRCA1. Two case-control studies of incident breast cancer indicate that the HMMR locus is associated with higher risk of breast cancer in humans. Our network modeling strategy should be useful for the discovery of additional cancer-associated genes.
ABSTRACT: The global set of relationships between protein targets of all drugs and all disease-gene products in the human protein-protein interaction or 'interactome' network remains uncharacterized. We built a bipartite graph composed of US Food and Drug Administration-approved drugs and proteins linked by drug-target binary associations. The resulting network connects most drugs into a highly interlinked giant component, with strong local clustering of drugs of similar types according to Anatomical Therapeutic Chemical classification. Topological analyses of this network quantitatively showed an overabundance of 'follow-on' drugs, that is, drugs that target already targeted proteins. By including drugs currently under investigation, we identified a trend toward more functionally diverse targets improving polypharmacology. To analyze the relationships between drug targets and disease-gene products, we measured the shortest distance between both sets of proteins in current models of the human interactome network. Significant differences in distance were found between etiological and palliative drugs. A recent trend toward more rational drug design was observed.
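The shortest-distance measurement between drug targets and disease-gene products mentioned in the abstract above can be sketched with a multi-source breadth-first search. This is only an illustration under stated assumptions: the adjacency map `TOY_INTERACTOME`, the protein labels and the function `min_distance` are hypothetical, and the abstract does not say which algorithm or distance definition the authors actually used.

```python
from collections import deque

# Hypothetical toy interactome as an undirected adjacency map.
TOY_INTERACTOME = {
    "P1": ["P2"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4"],
    "P4": ["P3"],
    "P5": [],  # isolated protein
}


def min_distance(adjacency, sources, targets):
    """Fewest edges between any source and any target; None if unreachable.

    Breadth-first search started from all sources at once visits proteins in
    order of increasing distance, so the first target reached is the closest.
    """
    targets = set(targets)
    distance = {s: 0 for s in sources}
    queue = deque(distance)
    while queue:
        current = queue.popleft()
        if current in targets:
            return distance[current]
        for neighbour in adjacency.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    # Drug target P1 versus disease-gene product P4: three interactions apart.
    print(min_distance(TOY_INTERACTOME, {"P1"}, {"P4"}))
    # A drug acting directly on the disease-gene product has distance 0.
    print(min_distance(TOY_INTERACTOME, {"P3"}, {"P3", "P4"}))
```

Under this reading, a distance of 0 means the drug target is itself a disease-gene product, and larger values mean the drug acts further away in the network.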
ABSTRACT: A wealth of molecular interaction data is available in the literature, ranging from large-scale datasets to a single interaction confirmed by several different techniques. These data are all too often reported either as free text or in tables of variable format, and are often missing key pieces of information essential for a full understanding of the experiment. Here we propose MIMIx, the minimum information required for reporting a molecular interaction experiment. Adherence to these reporting guidelines will result in publications of increased clarity and usefulness to the scientific community and will support the rapid, systematic capture of molecular interaction data in public databases, thereby improving access to valuable interaction data.
ABSTRACT: Fas-activated serine/threonine phosphoprotein (FAST) is a survival protein that is tethered to the outer mitochondrial membrane. In cells subjected to environmental stress, FAST moves to stress granules, where it interacts with TIA1 to modulate the process of stress-induced translational silencing. Both FAST and TIA1 are also found in the nucleus, where TIA1 promotes the inclusion of exons flanked by weak splice recognition sites such as exon IIIb of the fibroblast growth factor receptor 2 (FGFR2) mRNA. Two-hybrid interaction screens and biochemical analysis reveal that FAST binds to several alternative and constitutive splicing regulators, suggesting that FAST might participate in this process. The finding that FAST is concentrated at nuclear speckles also supports this contention. We show that FAST, like TIA1, promotes the inclusion of exon IIIb of the FGFR2 mRNA. Both FAST and TIA1 target a U-rich intronic sequence (IAS1) adjacent to the 5' splice site of exon IIIb. However, unlike TIA1, FAST does not bind to the IAS1 sequence. Surprisingly, knockdown experiments reveal that FAST and TIA1 act independently of one another to promote the inclusion of exon IIIb. Mutational analysis reveals that FAST-mediated alternative splicing is separable from the survival effects of FAST. Our data reveal that nuclear FAST can regulate the splicing of FGFR2 transcripts.
Proceedings of the National Academy of Sciences 08/2007; 104(27):11370-5. · 9.81 Impact Factor
ABSTRACT: The karyotypic chaos exhibited by human epithelial cancers complicates efforts to identify mutations critical for malignant transformation. Here we integrate complementary genomic approaches to identify human oncogenes. We show that activation of the ERK and phosphatidylinositol 3-kinase (PI3K) signaling pathways cooperate to transform human cells. Using a library of activated kinases, we identify several kinases that replace PI3K signaling and render cells tumorigenic. Whole genome structural analyses reveal that one of these kinases, IKBKE (IKKepsilon), is amplified and overexpressed in breast cancer cell lines and patient-derived tumors. Suppression of IKKepsilon expression in breast cancer cell lines that harbor IKBKE amplifications induces cell death. IKKepsilon activates the nuclear factor-kappaB (NF-kappaB) pathway in both cell lines and breast cancers. These observations suggest a mechanism for NF-kappaB activation in breast cancer, implicate the NF-kappaB pathway as a downstream mediator of PI3K, and provide a framework for integrated genomic approaches in oncogene discovery.