Critical amino acid residues in proteins: a BioMart integration of Reactome protein annotations with PRIDE mass spectrometry data and COSMIC somatic mutations

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Database: The Journal of Biological Databases and Curation (Impact Factor: 4.46). 01/2011; 2011:bar047. DOI: 10.1093/database/bar047
Source: PubMed

ABSTRACT The reversible phosphorylation of serine, threonine and tyrosine hydroxyl groups is an especially prominent form of post-translational modification (PTM) of proteins. It plays critical roles in the regulation of diverse processes, and mutations that directly or indirectly affect these phosphorylation events have been associated with many cancers and other pathologies. Here, we describe the development of a new BioMart tool that gathers data from three different biological resources to provide the user with an integrated view of phosphorylation events associated with a human protein of interest, the complexes of which the protein (modified or not) is a part, the reactions in which the protein and its complexes participate and the somatic mutations that might be expected to perturb those functions. The three resources used are the Reactome, PRIDE and COSMIC databases. The Reactome knowledgebase contains annotations of phosphorylated human proteins linked to the reactions in which they are phosphorylated and dephosphorylated, to the complexes of which they are parts and to the reactions in which the phosphorylated proteins participate as substrates, catalysts and regulators. The PRIDE database holds extensive mass spectrometry data from which protein phosphorylation patterns can be inferred, and the COSMIC database holds records of somatic mutations found in human cancer cells. This tool supports both flexible, user-specified queries and standard ('canned') queries to retrieve frequently used combinations of data for user-specified proteins and reactions. Using the Wnt signaling pathway and the human c-SRC protein, we demonstrate how the tool can be used to place somatic mutation data into a functional perspective by identifying mutations that change critical residues involved in pathway modulation and, where available, to check for mass spectrometry evidence in PRIDE supporting identification of the critical residue.
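
The integration described in the abstract can be pictured as a join on protein accession and residue position. The following is a minimal Python sketch under stated assumptions, not the authors' BioMart implementation: the record types, the `integrate` function, the accession `PROT_A` and every value in the toy data are hypothetical and do not come from Reactome, PRIDE or COSMIC.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhosphoSite:
    """Hypothetical stand-in for a Reactome-style phosphorylation annotation."""
    accession: str   # protein identifier (placeholder, not a real accession)
    position: int    # residue position in the protein sequence
    residue: str     # one-letter amino acid code: S, T or Y
    reaction: str    # reaction in which the site is modified


@dataclass(frozen=True)
class Mutation:
    """Hypothetical stand-in for a COSMIC-style somatic mutation record."""
    accession: str
    position: int
    change: str      # amino acid change, e.g. "S10F"


def integrate(sites, mutations, ms_evidence):
    """Join the three sources on (accession, residue position).

    ms_evidence is a set of (accession, position) pairs standing in for
    PRIDE-style mass spectrometry support for a phosphorylated residue.
    """
    muts_by_key = defaultdict(list)
    for m in mutations:
        muts_by_key[(m.accession, m.position)].append(m.change)

    rows = []
    for s in sites:
        key = (s.accession, s.position)
        rows.append({
            "accession": s.accession,
            "site": f"{s.residue}{s.position}",
            "reaction": s.reaction,
            # Mutations that hit the annotated residue directly.
            "mutations": sorted(muts_by_key.get(key, [])),
            # Whether any mass spectrometry record supports this site.
            "ms_evidence": key in ms_evidence,
        })
    return rows


# Toy data, invented purely for illustration.
SITES = [
    PhosphoSite("PROT_A", 10, "S", "toy reaction 1"),
    PhosphoSite("PROT_A", 25, "Y", "toy reaction 2"),
]
MUTATIONS = [
    Mutation("PROT_A", 10, "S10F"),   # overlaps an annotated site
    Mutation("PROT_A", 40, "G40D"),   # no annotated site at this position
]
MS_EVIDENCE = {("PROT_A", 25)}

if __name__ == "__main__":
    for row in integrate(SITES, MUTATIONS, MS_EVIDENCE):
        print(row)
```

Keying on the (accession, position) pair keeps the join symmetric across sources: a mutation with no annotated site is simply dropped from the site-centred view, while a site with no mutation or no spectra is still reported with an empty list or a false flag.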

Available from: David Croft, Jun 21, 2015
  • ABSTRACT: The BioMart Community Portal is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool is now accessible through graphical and web service interface. The BioMart community portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
    Nucleic Acids Research 04/2015; DOI:10.1093/nar/gkv350 · 8.81 Impact Factor
  • ABSTRACT: The PRoteomics IDEntifications (PRIDE) database at the European Bioinformatics Institute is one of the most prominent data repositories of mass spectrometry (MS)-based proteomics data. Here, we summarize recent developments in the PRIDE database and related tools. First, we provide up-to-date statistics in data content, splitting the figures by groups of organisms and species, including peptide and protein identifications, and post-translational modifications. We then describe the tools that are part of the PRIDE submission pipeline, especially the recently developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector (visualization and analysis tool). We also give an update about the integration of PRIDE with other MS proteomics resources in the context of the ProteomeXchange consortium. Finally, we briefly review the quality control efforts that are ongoing at present and outline our future plans.
    Nucleic Acids Research 11/2012; 41(Database issue). DOI:10.1093/nar/gks1262 · 8.81 Impact Factor
  • ABSTRACT: The 19th annual Database Issue of Nucleic Acids Research features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to the databases previously described in NAR and other journals. The highlights of this issue include, among others, a description of neXtProt, a knowledgebase on human proteins; a detailed explanation of the principles behind the NCBI Taxonomy Database; NCBI and EBI papers on the recently launched BioSample databases that store sample information for a variety of database resources; descriptions of the recent developments in the Gene Ontology and UniProt Gene Ontology Annotation projects; updates on Pfam, SMART and InterPro domain databases; update papers on KEGG and TAIR, two universally acclaimed databases that face an uncertain future; and a separate section with 10 wiki-based databases, introduced in an accompanying editorial. The NAR online Molecular Biology Database Collection has been updated and now lists 1380 databases. Brief machine-readable descriptions of the databases featured in this issue, according to the BioDBcore standards, will be provided at the web site. The full content of the Database Issue is freely available online on the Nucleic Acids Research web site.
    Nucleic Acids Research 12/2011; 40(Database issue):D1-8. DOI:10.1093/nar/gkr1196 · 8.81 Impact Factor
