ABSTRACT: With the ability to fully sequence tumor genomes/exomes, the quest for cancer driver genes can now be undertaken in an unbiased manner. However, obtaining a complete catalog of cancer genes is difficult due to the heterogeneous molecular nature of the disease and the limitations of available computational methods. Here we show that combining complementary methods makes it possible to identify a comprehensive and reliable list of cancer driver genes. We provide a list of 291 high-confidence cancer driver genes acting on 3,205 tumors from 12 different cancer types. Among those genes, some have not been previously identified as cancer drivers and 16 have a clear preference to sustain mutations in one specific tumor type. The novel driver candidates complement our current picture of the emergence of these diseases. In summary, the catalog of driver genes and the methodology presented here open new avenues to better understand the mechanisms of tumorigenesis.
ABSTRACT: A rapidly growing corpus of formal, computable pathway information can be used to answer important biological questions including finding non-trivial connections between cellular processes, identifying significantly altered portions of the cellular network in a disease state and building predictive models that can be used for precision medicine. Due to its complexity and fragmented nature, however, working with pathway data is still difficult. We present Paxtools, a Java library that contains algorithms, software components and converters for biological pathways represented in the standard BioPAX language. Paxtools allows scientists to focus on their scientific problem by removing technical barriers to access and analyse pathway information. Paxtools can run on any platform that has a Java Runtime Environment and was tested on most modern operating systems. Paxtools is open source and is available under the Lesser GNU public license (LGPL), which allows users to freely use the code in their software systems with a requirement for attribution. Source code for the current release (4.2.0) can be found in Software S1. A detailed manual for obtaining and using Paxtools can be found in Protocol S1. The latest sources and release bundles can be obtained from biopax.org/paxtools.
ABSTRACT: BioPAX is a community-developed standard language for biological pathway data. A key functionality required for efficient BioPAX data exchange is validation: detecting errors and inconsistencies in BioPAX documents. The BioPAX Validator is a command line tool, Java library, and on-line web service for BioPAX that performs more than 100 classes of consistency checks. Availability and Implementation: The validator recognizes common syntactic errors and semantic inconsistencies and reports them in a customizable human-readable format. It can also automatically fix some errors and normalize BioPAX data. Since its release, the validator has become a critical tool for the pathway informatics community, detecting thousands of errors and helping substantially increase the conformity and uniformity of BioPAX formatted data. The BioPAX Validator is open source and released under the LGPL v3 license. All sources, binaries, and documentation can be found at sf.net/p/biopax, and the latest stable version of the web application is available at biopax.org/validator.
ABSTRACT: GeneMANIA (http://www.genemania.org) is a flexible user-friendly web interface for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays. Given a query gene list, GeneMANIA extends the list with functionally similar genes that it identifies using available genomics and proteomics data. GeneMANIA also reports weights that indicate the predictive value of each selected data set for the query. GeneMANIA can also be used in a function prediction setting: given a query gene, GeneMANIA finds a small set of genes that are most likely to share function with that gene based on their interactions with it. Enriched Gene Ontology categories among this set can sometimes point to the function of the gene. Seven organisms are currently supported (Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Homo sapiens, Rattus norvegicus and Saccharomyces cerevisiae), and hundreds of data sets have been collected from GEO, BioGRID, IRefIndex and I2D, as well as organism-specific functional genomics data sets. Users can customize their search by selecting specific data sets to query and by uploading their own data sets to analyze.
Nucleic Acids Research 07/2013; 41(Web Server issue):W115-22. · 8.28 Impact Factor
ABSTRACT: Cytoscape is an open source software tool for biological network visualization and analysis, which can be extended with independently developed apps. We launched the Cytoscape App Store to highlight the important features that apps add to Cytoscape, enable researchers to find and install apps they need and help developers promote their apps. AVAILABILITY: The App Store is available at http://apps.cytoscape.org. CONTACT: email@example.com.
ABSTRACT: Src homology 3 (SH3) domains bind peptides to mediate protein-protein interactions that assemble and regulate dynamic biological processes. We surveyed the repertoire of SH3 binding specificity using peptide phage display in a metazoan, the worm Caenorhabditis elegans, and discovered that it structurally mirrors that of the budding yeast Saccharomyces cerevisiae. We then mapped the worm SH3 interactome using stringent yeast two-hybrid and compared it with the equivalent map for yeast. We found that the worm SH3 interactome resembles the analogous yeast network because it is significantly enriched for proteins with roles in endocytosis. Nevertheless, orthologous SH3 domain-mediated interactions are highly rewired. Our results suggest a model of network evolution where general function of the SH3 domain network is conserved over its specific form.
Molecular Systems Biology 04/2013; 9:652. · 11.34 Impact Factor
ABSTRACT: Large-scale cancer genome sequencing has uncovered thousands of gene mutations, but distinguishing tumor driver genes from functionally neutral passenger mutations is a major challenge. We analyzed 800 cancer genomes of eight types to find single-nucleotide variants (SNVs) that precisely target phosphorylation machinery, important in cancer development and drug targeting. Assuming that cancer-related biological systems involve unexpectedly frequent mutations, we used novel algorithms to identify genes with significant phosphorylation-associated SNVs (pSNVs), phospho-mutated pathways, kinase networks, drug targets, and clinically correlated signaling modules. We highlight increased survival of patients with TP53 pSNVs, hierarchically organized cancer kinase modules, a novel pSNV in EGFR, and an immune-related network of pSNVs that correlates with prolonged survival in ovarian cancer. Our findings include multiple actionable cancer gene candidates (FLNB, GRM1, POU2F1), protein complexes (HCF1, ASF1), and kinases (PRKCZ). This study demonstrates new ways of interpreting cancer genomes and presents new leads for cancer research.
Molecular Systems Biology 01/2013; 9:637. · 11.34 Impact Factor
ABSTRACT: BACKGROUND: PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. RESULTS: We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training-testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. CONCLUSIONS: We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on training-testing domain sequence similarity. Using both predictors, we defined a functional map of human PDZ domain biology and predict novel PDZ domain function. Users may access our structure-based and previous sequence-based predictors at http://webservice.baderlab.org/domains/POW.
ABSTRACT: The characterization of the complex phenomenon of cell differentiation is a key goal of both systems and computational biology. GESTODIFFERENT is a Cytoscape plugin aimed at the generation and the identification of Gene Regulatory Networks (GRNs) describing an arbitrary stochastic cell differentiation process. The (dynamical) model adopted to describe general GRNs is that of Noisy Random Boolean Networks (NRBNs), with a specific focus on their emergent dynamical behavior. GESTODIFFERENT explores the space of GRNs by filtering the NRBN instances inconsistent with a stochastic lineage differentiation tree representing the cell lineages that can be obtained by following the fate of a stem cell descendant. Matched networks can then be analyzed by Cytoscape network analysis algorithms or, for instance, used to define (multiscale) models of cellular dynamics. AVAILABILITY: Freely available at http://bimib.disco.unimib.it/index.php/Retronet#GESTODifferent under a BSD-like license or at the Cytoscape App Store http://apps.cytoscape.org/. CONTACT: firstname.lastname@example.org.
ABSTRACT: Recently, we demonstrated that the anti-bacterial agent tigecycline preferentially induces death in leukemia cells through the inhibition of mitochondrial protein synthesis. Here, we sought to understand mechanisms of resistance to tigecycline by establishing a leukemia cell line resistant to the drug. TEX leukemia cells were treated with increasing concentrations of tigecycline over 4 months and a population of cells resistant to tigecycline (RTEX+TIG) was selected. Compared to wild type cells, RTEX+TIG cells had undetectable levels of mitochondrially translated proteins Cox-1 and Cox-2, reduced oxygen consumption and increased rates of glycolysis. Moreover, RTEX+TIG cells were more sensitive to inhibitors of glycolysis and more resistant to hypoxia. By electron microscopy, RTEX+TIG cells had abnormally swollen mitochondria with irregular cristae structures. RNA sequencing demonstrated a significant over-representation of genes with binding sites for the HIF1α:HIF1β transcription factor complex in their promoters. Upregulation of HIF1α mRNA and protein in RTEX+TIG cells was confirmed by Q-RTPCR and immunoblotting. Strikingly, upon removal of tigecycline from RTEX+TIG cells, the cells re-established aerobic metabolism. Levels of Cox-1 and Cox-2, oxygen consumption, glycolysis, mitochondrial mass and mitochondrial membrane potential returned to wild type levels, but HIF1α remained elevated. However, upon re-treatment with tigecycline for 72 hours, the glycolytic phenotype was re-established. Thus, we have generated cells with a reversible metabolic phenotype by chronic treatment with an inhibitor of mitochondrial protein synthesis. These cells will provide insight into cellular adaptations used to cope with metabolic stress.
PLoS ONE 01/2013; 8(3):e58367. · 3.73 Impact Factor
ABSTRACT: Somatic mutations in cancer genomes include drivers that provide selective advantages to tumor cells and passengers present due to genome instability. Discovery of pan-cancer drivers will help characterize biological systems important in multiple cancers and lead to development of better therapies. Driver genes are most often identified by their recurrent mutations across tumor samples. However, some mutations are more important for protein function than others. Thus, considering the location of mutations with respect to functional protein sites can predict their mechanisms of action and improve the sensitivity of driver gene detection. Protein phosphorylation is a post-translational modification central to cancer biology and treatment, and frequently altered by driver mutations. Here we used our ActiveDriver method to analyze known phosphorylation sites mutated by single nucleotide variants (SNVs) in The Cancer Genome Atlas Research Network (TCGA) pan-cancer dataset of 3,185 genomes and 12 cancer types. Phosphorylation-related SNVs (pSNVs) occur in ~90% of tumors, show increased conservation and functional mutation impact compared to other protein-coding mutations, and are enriched in cancer genes and pathways. Gene-centric analysis found 150 known and candidate cancer genes with significant pSNV recurrence. Using a novel computational method, we predict that 29% of these mutations directly abolish phosphorylation or modify kinase target sites to rewire signaling pathways. This analysis shows that incorporation of information about protein signaling sites will improve computational pipelines for variant function prediction.
ABSTRACT: The biology and disease oriented branch of the Human Proteome Project (B/D-HPP) was established by the Human Proteome Organization (HUPO) with the main goal of supporting the broad application of state-of-the-art measurements of proteins and proteomes by life scientists studying the molecular mechanisms of biological processes and human disease. This will be accomplished through the generation of research and informational resources that will support the routine and definitive measurement of the process- or disease-relevant proteins. The B/D-HPP is highly complementary to the C-HPP and will provide datasets and biological characterization useful to the C-HPP teams. In this manuscript we describe the goals, the plans, and the current status of the B/D-HPP.
Journal of Proteome Research 12/2012; · 5.06 Impact Factor
ABSTRACT: Lifelong blood cell production is governed through the poorly understood integration of cell-intrinsic and -extrinsic control of hematopoietic stem cell (HSC) quiescence and activation. MicroRNAs (miRNAs) coordinately regulate multiple targets within signaling networks, making them attractive candidate HSC regulators. We report that miR-126, a miRNA expressed in HSC and early progenitors, plays a pivotal role in restraining cell-cycle progression of HSC in vitro and in vivo. miR-126 knockdown by using lentiviral sponges increased HSC proliferation without inducing exhaustion, resulting in expansion of mouse and human long-term repopulating HSC. Conversely, enforced miR-126 expression impaired cell-cycle entry, leading to progressively reduced hematopoietic contribution. In HSC/early progenitors, miR-126 regulates multiple targets within the PI3K/AKT/GSK3β pathway, attenuating signal transduction in response to extrinsic signals. These data establish that miR-126 sets a threshold for HSC activation and thus governs HSC pool size, demonstrating the importance of miRNA in the control of HSC function.
ABSTRACT: Cytoscape is open-source software for integration, visualization and analysis of biological networks. It can be extended through Cytoscape plugins, enabling a broad community of scientists to contribute useful features. This growth has occurred organically through the independent efforts of diverse authors, yielding a powerful but heterogeneous set of tools. We present a travel guide to the world of plugins, covering the 152 publicly available plugins for Cytoscape 2.5-2.8. We also describe ongoing efforts to distribute, organize and maintain the quality of the collection.
ABSTRACT: BACKGROUND: The use of biological molecular network information for diagnostic and prognostic purposes and elucidation of molecular disease mechanism is a key objective in systems biomedicine. The network of regulatory miRNA-target and functional protein interactions is a rich source of information to elucidate the function and the prognostic value of miRNAs in cancer. The objective of this study is to identify miRNAs that have high influence on target protein complexes in prostate cancer as a case study. This could provide biomarkers or therapeutic targets relevant for prostate cancer treatment. RESULTS: Our findings demonstrate that a miRNA's functional role can be explained by its target protein connectivity within a physical and functional interaction network. To detect miRNAs with high influence on target protein modules, we integrated miRNA and mRNA expression profiles with a sequence based miRNA-target network and human functional and physical protein interactions (FPI). miRNAs with high influence on target protein complexes play a role in prostate cancer progression and are promising diagnostic or prognostic biomarkers. We uncovered several miRNA-regulated protein modules which were enriched in focal adhesion and prostate cancer genes. Several miRNAs such as miR-96, miR-182, and miR-143 demonstrated high influence on their target protein complexes and could explain most of the gene expression changes in our analyzed prostate cancer data set. CONCLUSIONS: We describe a novel method to identify active miRNA-target modules relevant to prostate cancer progression and outcome. miRNAs with high influence on protein networks are valuable biomarkers that can be used in clinical investigations for prostate cancer treatment.
BMC Systems Biology 08/2012; 6(1):112. · 2.98 Impact Factor
ABSTRACT: Medulloblastoma, the most common malignant paediatric brain tumour, is currently treated with nonspecific cytotoxic therapies including surgery, whole-brain radiation, and aggressive chemotherapy. As medulloblastoma exhibits marked intertumoural heterogeneity, with at least four distinct molecular variants, previous attempts to identify targets for therapy have been underpowered because of small sample sizes. Here we report somatic copy number aberrations (SCNAs) in 1,087 unique medulloblastomas. SCNAs are common in medulloblastoma, and are predominantly subgroup-enriched. The most common region of focal copy number gain is a tandem duplication of SNCAIP, a gene associated with Parkinson's disease, which is exquisitely restricted to Group 4α. Recurrent translocations of PVT1, including PVT1-MYC and PVT1-NDRG1, that arise through chromothripsis are restricted to Group 3. Numerous targetable SCNAs, including recurrent events targeting TGF-β signalling in Group 3, and NF-κB signalling in Group 4, suggest future avenues for rational, targeted therapy.
ABSTRACT: Many long-lived species of animals require the function of adult stem cells throughout their lives. However, the transcriptomes of stem cells in invertebrates and vertebrates have not been compared, and consequently, ancestral regulatory circuits that control stem cell populations remain poorly defined. In this study, we have used data from high-throughput RNA sequencing to compare the transcriptomes of pluripotent adult stem cells from planarians with the transcriptomes of human and mouse pluripotent embryonic stem cells. From a stringently defined set of 4,432 orthologs shared between planarians, mice and humans, we identified 123 conserved genes that are ≥5-fold differentially expressed in stem cells from all three species. Guided by this gene set, we used RNAi screening in adult planarians to discover novel stem cell regulators, which we found to affect the stem cell-associated functions of tissue homeostasis, regeneration, and stem cell maintenance. Examples of genes that disrupted these processes included the orthologs of TBL3, PSD12, TTC27, and RACK1. From these analyses, we concluded that by comparing stem cell transcriptomes from diverse species, it is possible to uncover conserved factors that function in stem cell biology. These results provide insights into which genes comprised the ancestral circuitry underlying the control of stem cell self-renewal and pluripotency.