ABSTRACT: Physical interactions between proteins play a key role in probably every cellular process. Efforts to chart the protein interaction networks are ongoing in a number of model organisms using a diversity of approaches. The resulting genome-wide interaction maps will provide a scaffold for further detailed functional analysis. We developed MAPPIT, a mammalian two-hybrid approach that allows identification and analysis of mammalian protein-protein interactions in their native environment. Here, we introduce an efficient MAPPIT assay that permits high-throughput screening of arrayed collections of proteins and complements a previously published cDNA library screening approach. We validated both methods in screens for interaction partners of the Cullin-based E3 ubiquitin ligase subunits SKP1 and Elongin C. In addition to a number of known interactors, novel SKP1 and Elongin C binding proteins were identified. The array assay is an important addition to the MAPPIT suite of technologies that is expected to significantly increase its utility as a toolbox to screen for novel interactors of proteins or small molecules.
Journal of Proteome Research 02/2009; 8(2):877-86. · 5.06 Impact Factor
ABSTRACT: To provide accurate biological hypotheses and elucidate global properties of cellular networks, systematic identification of protein-protein interactions must meet high quality standards. We present an expanded C. elegans protein-protein interaction network, or 'interactome' map, derived from testing a matrix of approximately 10,000 x approximately 10,000 proteins using a highly specific, high-throughput yeast two-hybrid system. Through a new empirical quality control framework, we show that the resulting data set (Worm Interactome 2007, or WI-2007) was similar in quality to low-throughput data curated from the literature. We filtered previous interaction data sets and integrated them with WI-2007 to generate a high-confidence consolidated map (Worm Interactome version 8, or WI8). This work allowed us to estimate the size of the worm interactome at approximately 116,000 interactions. Comparison with other types of functional genomic data shows the complementarity of distinct experimental approaches in predicting different functional relationships between genes or proteins.
ABSTRACT: Information on protein-protein interactions is of central importance for many areas of biomedical research. At present no method exists to systematically and experimentally assess the quality of individual interactions reported in interaction mapping experiments. To provide a standardized confidence-scoring method that can be applied to tens of thousands of protein interactions, we have developed an interaction tool kit consisting of four complementary, high-throughput protein interaction assays. We benchmarked these assays against positive and random reference sets consisting of well documented pairs of interacting human proteins and randomly chosen protein pairs, respectively. A logistic regression model was trained using the data from these reference sets to combine the assay outputs and calculate the probability that any newly identified interaction pair is a true biophysical interaction once it has been tested in the tool kit. This general approach will allow a systematic and empirical assignment of confidence scores to all individual protein-protein interactions in interactome networks.
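The confidence-scoring idea in this abstract can be illustrated with a minimal sketch: fit a logistic regression on binary outcomes from four assays for a positive and a random reference set, then score a new pair. The toy reference data, the function names, and the plain gradient-descent fit are all assumptions made for illustration; they are not the authors' data, assays, or implementation.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def confidence(weights, bias, assay_results):
    """Probability that a pair is a true interaction, given its assay calls."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, assay_results)))


# Hypothetical toy data: each tuple holds 0/1 calls from four assays.
POSITIVE_REFERENCE = [
    (1, 1, 1, 0), (1, 0, 1, 1), (0, 1, 1, 0),
    (1, 1, 0, 1), (0, 0, 1, 0), (0, 0, 0, 0),
]
RANDOM_REFERENCE = [
    (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0),
    (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0),
]

rows = POSITIVE_REFERENCE + RANDOM_REFERENCE
labels = [1] * len(POSITIVE_REFERENCE) + [0] * len(RANDOM_REFERENCE)
weights, bias = fit_logistic(rows, labels)

# Score two hypothetical new pairs: detected by every assay, and by none.
score_all_assays = confidence(weights, bias, (1, 1, 1, 1))
score_no_assays = confidence(weights, bias, (0, 0, 0, 0))
print(f"all four assays positive: {score_all_assays:.3f}")
print(f"no assay positive:        {score_no_assays:.3f}")
```

In this toy setup a pair detected by several assays receives a higher score than one detected by none, which is the behaviour the abstract describes; real assay outputs and reference sets would be needed for meaningful probabilities.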
ABSTRACT: Several attempts have been made to systematically map protein-protein interaction, or 'interactome', networks. However, it remains difficult to assess the quality and coverage of existing data sets. Here we describe a framework that uses an empirically-based approach to rigorously dissect quality parameters of currently available human interactome maps. Our results indicate that high-throughput yeast two-hybrid (HT-Y2H) interactions for human proteins are more precise than literature-curated interactions supported by a single publication, suggesting that HT-Y2H is suitable to map a significant portion of the human interactome. We estimate that the human interactome contains approximately 130,000 binary interactions, most of which remain to be mapped. Similar to estimates of DNA sequence data quality and genome size early in the Human Genome Project, estimates of protein interaction data quality and interactome size are crucial to establish the magnitude of the task of comprehensive human interactome mapping and to elucidate a path toward this goal.
ABSTRACT: Cellular functions are mediated through complex systems of macromolecules and metabolites linked through biochemical and physical interactions, represented in interactome models as 'nodes' and 'edges', respectively. Better understanding of genotype-to-phenotype relationships in human disease will require modeling of how disease-causing mutations affect systems or interactome properties. Here we investigate how perturbations of interactome networks may differ between complete loss of gene products ('node removal') and interaction-specific or edge-specific ('edgetic') alterations. Global computational analyses of approximately 50,000 known causative mutations in human Mendelian disorders revealed clear separations of mutations probably corresponding to those of node removal versus edgetic perturbations. Experimental characterization of mutant alleles in various disorders identified diverse edgetic interaction profiles of mutant proteins, which correlated with distinct structural properties of disease proteins and disease mechanisms. Edgetic perturbations seem to confer distinct functional consequences from node removal because a large fraction of cases in which a single gene is linked to multiple disorders can be modeled by distinguishing edgetic network perturbations. Edgetic network perturbation models might improve both the understanding of dissemination of disease alleles in human populations and the development of molecular therapeutic strategies.
Molecular Systems Biology 01/2009; 5:321. · 11.34 Impact Factor
ABSTRACT: Understanding the consequences on host physiology induced by viral infection requires complete understanding of the perturbations caused by virus proteins on the cellular protein interaction network. The VirusMINT database (http://mint.bio.uniroma2.it/virusmint/) aims at collecting all protein interactions between viral and human proteins reported in the literature. VirusMINT currently stores over 5000 interactions involving more than 490 unique viral proteins from more than 110 different viral strains. The whole data set can be easily queried through the search pages and the results can be displayed with a graphical viewer. The curation effort has focused on manuscripts reporting interactions between human proteins and proteins encoded by some of the most medically relevant viruses: papilloma viruses, human immunodeficiency virus 1, Epstein-Barr virus, hepatitis B virus, hepatitis C virus, herpes viruses and Simian virus 40.
Nucleic Acids Research 11/2008; 37(Database issue):D669-73. · 8.81 Impact Factor
ABSTRACT: Current yeast interactome network maps contain several hundred molecular complexes with limited and somewhat controversial representation of direct binary interactions. We carried out a comparative quality assessment of current yeast interactome data sets, demonstrating that high-throughput yeast two-hybrid (Y2H) screening provides high-quality binary interaction information. Because a large fraction of the yeast binary interactome remains to be mapped, we developed an empirically controlled mapping framework to produce a "second-generation" high-quality, high-throughput Y2H data set covering approximately 20% of all yeast binary interactions. Both Y2H and affinity purification followed by mass spectrometry (AP/MS) data are of equally high quality but of a fundamentally different and complementary nature, resulting in networks with different topological and biological properties. Compared to co-complex interactome models, this binary map is enriched for transient signaling interactions and intercomplex connections with a highly significant clustering between essential proteins. Rather than correlating with essentiality, protein connectivity correlates with genetic pleiotropy.
ABSTRACT: Many protein-protein interactions are mediated through independently folding modular domains. Proteome-wide efforts to model protein-protein interaction or "interactome" networks have largely ignored this modular organization of proteins. We developed an experimental strategy to efficiently identify interaction domains and generated a domain-based interactome network for proteins involved in C. elegans early-embryonic cell divisions. Minimal interacting regions were identified for over 200 proteins, providing important information on their domain organization. Furthermore, our approach increased the sensitivity of the two-hybrid system, resulting in a more complete interactome network. This interactome modeling strategy revealed insights into C. elegans centrosome function and is applicable to other biological processes in this and other organisms.
ABSTRACT: Impaired mitochondrial function has been implicated in the pathogenesis of type 2 diabetes, heart failure, and neurodegeneration as well as during aging. Studies with the PGC-1 transcriptional coactivators have demonstrated that these factors are central components of the regulatory network that controls mitochondrial function in mammalian cells. Here we describe a genome-wide coactivation assay to globally identify transcription factors and cofactors in this pathway. These analyses revealed a molecular signature of the PGC-1alpha transcriptional network and identified BAF60a (SMARCD1) as a molecular link between the SWI/SNF chromatin-remodeling complexes and hepatic lipid metabolism. Adenoviral-mediated expression of BAF60a stimulates fatty acid beta-oxidation in cultured hepatocytes and ameliorates hepatic steatosis in vivo. PGC-1alpha mediates the recruitment of BAF60a to PPARalpha-binding sites, leading to transcriptional activation of peroxisomal and mitochondrial fat-oxidation genes. These results define a role for the SWI/SNF complexes in the regulation of lipid homeostasis.
ABSTRACT: Describing the 'ORFeome' of an organism, including all major isoforms, is essential for a system-level understanding of any species; however, conventional cloning and sequencing approaches are prohibitively costly and labor-intensive. We describe a potentially genome-wide methodology for efficiently capturing new coding isoforms using reverse transcriptase (RT)-PCR recombinational cloning, 'deep-well' pooling and a next-generation sequencing platform. This ORFeome discovery pipeline will be applicable to any eukaryotic species with a sequenced genome.
ABSTRACT: Accurately defining the coding potential of an organism, i.e., all protein-encoding open reading frames (ORFs) or "ORFeome," is a prerequisite to fully understand its biology. ORFeome annotation involves iterative computational predictions from genome sequences combined with experimental verifications. Here we reexamine a set of Saccharomyces cerevisiae "orphan" ORFs recently removed from the original ORFeome annotation due to lack of conservation across evolutionarily related yeast species. We show that many orphan ORFs produce detectable transcripts and/or translated products in various functional genomics and proteomics experiments. By combining a naïve Bayes model that predicts the likelihood of an ORF to encode a functional product with experimental verification of strand-specific transcripts, we argue that orphan ORFs should still remain candidates for functional ORFs. In support of this model, interstrain intraspecies genome sequence variation is lower across orphan ORFs than in intergenic regions, indicating that orphan ORFs endure functional constraints and resist deleterious mutations. We conclude that ORFs should be evaluated based on multiple levels of evidence and not be removed from ORFeome annotation solely based on low sequence conservation in other species. Rather, such ORFs might be important for micro-evolutionary divergence between species.
Genome Research 08/2008; 18(8):1294-303. · 14.40 Impact Factor
ABSTRACT: Mitochondria are complex organelles whose dysfunction underlies a broad spectrum of human diseases. Identifying all of the proteins resident in this organelle and understanding how they integrate into pathways represent major challenges in cell biology. Toward this goal, we performed mass spectrometry, GFP tagging, and machine learning to create a mitochondrial compendium of 1098 genes and their protein expression across 14 mouse tissues. We link poorly characterized proteins in this inventory to known mitochondrial pathways by virtue of shared evolutionary history. Using this approach, we predict 19 proteins to be important for the function of complex I (CI) of the electron transport chain. We validate a subset of these predictions using RNAi, including C8orf38, which we further show harbors an inherited mutation in a lethal, infantile CI deficiency. Our results have important implications for understanding CI function and pathogenesis and, more generally, illustrate how our compendium can serve as a foundation for systematic investigations of mitochondria.
ABSTRACT: Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to delineate a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
ABSTRACT: A proteome-wide mapping of interactions between hepatitis C virus (HCV) and human proteins was performed to provide a comprehensive view of the cellular infection. A total of 314 protein-protein interactions between HCV and human proteins was identified by yeast two-hybrid and 170 by literature mining. Integration of this data set into a reconstructed human interactome showed that cellular proteins interacting with HCV are enriched in highly central and interconnected proteins. A global analysis on the basis of functional annotation highlighted the enrichment of cellular pathways targeted by HCV. A network of proteins associated with frequent clinical disorders of chronically infected patients was constructed by connecting the insulin, Jak/STAT and TGFbeta pathways with cellular proteins targeted by HCV. CORE protein appeared as a major perturbator of this network. Focal adhesion was identified as a new function affected by HCV, mainly by NS3 and NS5A proteins.
Molecular Systems Biology 02/2008; 4:230. · 11.34 Impact Factor
ABSTRACT: © 2006 by John von Neumann Institute for Computing. Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise requires prior specific permission by the publisher mentioned above.