Critical amino acid residues in proteins: A BioMart integration of Reactome protein annotations with PRIDE mass spectrometry data and COSMIC somatic mutations

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Database The Journal of Biological Databases and Curation (Impact Factor: 3.37). 01/2011; 2011:bar047. DOI: 10.1093/database/bar047
Source: PubMed


The reversible phosphorylation of serine, threonine and tyrosine hydroxyl groups is an especially prominent form of post-translational modification (PTM) of proteins. It plays critical roles in the regulation of diverse processes, and mutations that directly or indirectly affect these phosphorylation events have been associated with many cancers and other pathologies. Here, we describe the development of a new BioMart tool that gathers data from three different biological resources to provide the user with an integrated view of phosphorylation events associated with a human protein of interest, the complexes of which the protein (modified or not) is a part, the reactions in which the protein and its complexes participate and the somatic mutations that might be expected to perturb those functions. The three resources used are the Reactome, PRIDE and COSMIC databases. The Reactome knowledgebase contains annotations of phosphorylated human proteins linked to the reactions in which they are phosphorylated and dephosphorylated, to the complexes of which they are parts and to the reactions in which the phosphorylated proteins participate as substrates, catalysts and regulators. The PRIDE database holds extensive mass spectrometry data from which protein phosphorylation patterns can be inferred, and the COSMIC database holds records of somatic mutations found in human cancer cells. This tool supports both flexible, user-specified queries and standard (‘canned’) queries to retrieve frequently used combinations of data for user-specified proteins and reactions. We demonstrate using the Wnt signaling pathway and the human c-SRC protein how the tool can be used to place somatic mutation data into a functional perspective by changing critical residues involved in pathway modulation, and where available, check for mass spectrometry evidence in PRIDE supporting identification of the critical residue.
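The integration described above can be thought of as a join on protein accession and residue position. The following is a minimal sketch of that idea only, not the BioMart implementation: the record types, the function name `critical_residue_hits`, the accession `EXAMPLE1` and all positions, reactions and mutations are hypothetical placeholders, not data from Reactome, PRIDE or COSMIC.

```python
from collections import defaultdict
from typing import NamedTuple


class PhosphoSite(NamedTuple):
    """A phosphorylated residue annotated on a protein (hypothetical record)."""
    accession: str  # protein accession (placeholder values below)
    position: int   # 1-based residue position
    residue: str    # 'S', 'T' or 'Y'
    reaction: str   # reaction the modified protein takes part in (made up)


class Mutation(NamedTuple):
    """A somatic mutation mapped to a protein residue (hypothetical record)."""
    accession: str
    position: int
    change: str     # e.g. 'Y100F' (made up)


def critical_residue_hits(sites, mutations, ms_evidence):
    """Return annotated phosphosites that coincide with a somatic mutation.

    ms_evidence is a set of (accession, position) pairs for which mass
    spectrometry support is assumed to exist.
    """
    # Index mutations by (accession, position) so each site is one lookup.
    by_key = defaultdict(list)
    for m in mutations:
        by_key[(m.accession, m.position)].append(m.change)

    hits = []
    for s in sites:
        key = (s.accession, s.position)
        if key in by_key:
            hits.append({
                "accession": s.accession,
                "site": f"{s.residue}{s.position}",
                "reaction": s.reaction,
                "mutations": sorted(by_key[key]),
                "ms_supported": key in ms_evidence,
            })
    return hits


# Illustrative, entirely made-up inputs.
sites = [
    PhosphoSite("EXAMPLE1", 100, "Y", "example kinase inactivation"),
    PhosphoSite("EXAMPLE1", 45, "S", "example complex assembly"),
]
mutations = [
    Mutation("EXAMPLE1", 100, "Y100F"),
    Mutation("EXAMPLE1", 200, "G200D"),
]
ms_evidence = {("EXAMPLE1", 100)}

hits = critical_residue_hits(sites, mutations, ms_evidence)
for hit in hits:
    print(hit)
```

Only the residue that is both annotated as phosphorylated and hit by a mutation is reported, with a flag saying whether mass spectrometry evidence supports it. A real query would obtain each of the three inputs from the corresponding resource rather than from in-memory lists.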
Database URL:



Available from: David Croft
    • "The BioMart interface is useful for batch data retrieval (25). In the current version of the PRIDE BioMart (running on BioMart version 0.7), data integration with Reactome (26) has been extended (27), by enabling the link between phosphorylated proteins present in Reactome pathways and phosphorylated proteins detected by MS approaches stored in PRIDE. The PRIDE BioMart data can also be accessed using a Representational State Transfer web service, which is heavily used. "
    ABSTRACT: The PRoteomics IDEntifications (PRIDE) database at the European Bioinformatics Institute is one of the most prominent data repositories of mass spectrometry (MS)-based proteomics data. Here, we summarize recent developments in the PRIDE database and related tools. First, we provide up-to-date statistics in data content, splitting the figures by groups of organisms and species, including peptide and protein identifications, and post-translational modifications. We then describe the tools that are part of the PRIDE submission pipeline, especially the recently developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector (visualization and analysis tool). We also give an update about the integration of PRIDE with other MS proteomics resources in the context of the ProteomeXchange consortium. Finally, we briefly review the quality control efforts that are ongoing at present and outline our future plans.
    Full-text · Article · Nov 2012 · Nucleic Acids Research
    • "Obviously, UniProtKB provides a plethora of links to all kinds of databases, including ENA, GenBank, DDBJ, RefSeq, PDBe, PDBj, IntAct, MINT, Ensembl, KEGG, UCSC Genome Browser, neXtProt, SGD, FlyBase, WormBase, MGD, TAIR, eggNOG, MetaCyc, InterPro, Gene3D, Pfam, SMART and ProtoNet, which are featured in this issue. However, many database interactions are more subtle: for example, BioMart has been recently used to link protein annotation data from the Reactome database of metabolic networks (41) to phosphoproteomics data in PRIDE (30) and somatic mutations in COSMIC (42), which allowed putting cancer-related mutation data into a functional context (43). "
    ABSTRACT: The 2015 Nucleic Acids Research Database Issue contains 172 papers that include descriptions of 56 new molecular biology databases, and updates on 115 databases whose descriptions have been previously published in NAR or other journals. Following the classification that was introduced last year in order to simplify navigation of the entire issue, these articles are divided into eight subject categories. This year's highlights include RNAcentral, an international community portal to various databases on noncoding RNA; ValidatorDB, a validation database for protein structures and their ligands; SASBDB, a primary repository for small-angle scattering data of various macromolecular complexes; MoonProt, a database of ‘moonlighting’ proteins, and two new databases of protein–protein and other macromolecular complexes, ComPPI and the Complex Portal. This issue also includes an unusually high number of cancer-related databases and other databases dedicated to genomic basics of disease and potential drugs and drug targets. The size of the NAR online Molecular Biology Database Collection remained approximately the same, following the addition of 74 new resources and removal of 77 obsolete web sites. The entire Database Issue is freely available online on the Nucleic Acids Research web site.
    Full-text · Article · Dec 2011 · Nucleic Acids Research