Improvements in the Protein Identifier Cross-Reference service.

EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Nucleic Acids Research (Impact Factor: 8.81). 04/2012; 40(Web Server issue):W276-80. DOI: 10.1093/nar/gks338
Source: PubMed

ABSTRACT The Protein Identifier Cross-Reference (PICR) service is a tool that allows users to map protein identifiers, protein sequences and gene identifiers across more than 100 source databases. PICR takes input through an interactive website as well as through Representational State Transfer (REST) and Simple Object Access Protocol (SOAP) services. It returns results as HTML pages or as XLS and CSV files. It has been in production since 2007 and has recently been enhanced to add new functionality and to increase the number of databases it covers. Protein subsequences can now be searched with the Basic Local Alignment Search Tool (BLAST) against the UniProt Knowledgebase (UniProtKB) to provide an entry point to the standard PICR mapping algorithm. In addition, gene identifiers from UniProtKB and Ensembl can now be submitted as input or mapped to as output from PICR. We have also implemented a 'best-guess' mapping algorithm for UniProt. In this article, we describe the usefulness of PICR, how these changes have been implemented, and the corresponding additions to the web services. Finally, we report that the number of source databases covered by PICR has increased from the initial 73 to the current 102. New resources include several new species-specific Ensembl databases as well as the Ensembl Genomes databases. PICR can be accessed at

  • ABSTRACT: The human brain is exceedingly complex, comprising billions of neurons and trillions of synaptic connections that, in turn, define ∼900 neuroanatomical subdivisions in the adult brain (Hawrylycz MJ et al. An anatomically comprehensive atlas of the human brain transcriptome. Nature 2012, 489, 391-399). The human brain transcriptome has revealed specific regional transcriptional signatures that are regulated in a spatio-temporal manner, increasing the complexity of the structural and molecular organization of this organ (Kang HJ et al. Spatio-temporal transcriptome of the human brain. Nature 2011, 478, 483-489). During the last decade, neuroproteomics has emerged as a powerful approach to profile neural proteomes using shotgun-based mass spectrometry (MS), providing complementary information about protein content and function at a global level. Here, we review recent proteome profiling studies performed in human brain, with special emphasis on proteome mapping of anatomical macrostructures, specific subcellular compartments, and cerebrospinal fluid (CSF). Moreover, we have performed an integrative functional analysis of the protein compilation derived from these large-scale human brain proteomic studies in order to obtain a comprehensive view of human brain biology. Finally, we also discuss the potential contribution of our meta-analysis to the Chromosome-centric Human Proteome Project (C-HPP) initiative. This article is protected by copyright. All rights reserved.
    Proteomics. Clinical applications 02/2015; DOI:10.1002/prca.201400127 · 2.68 Impact Factor
  • ABSTRACT: ProteomeScout is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data, coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection, which can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated in the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer allows for the simultaneous visualization of annotations in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative data measurements associated with public experiments are also easily viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
    Nucleic Acids Research 11/2014; 43(D1). DOI:10.1093/nar/gku1154 · 8.81 Impact Factor
  • ABSTRACT: On the Semantic Web, in life sciences in particular, data is often distributed via multiple resources. Each of these sources is likely to use its own IRI (Internationalized Resource Identifier) for conceptually the same resource or database record. The lack of correspondence between identifiers introduces a barrier when executing federated SPARQL queries across life science data. We introduce a novel SPARQL-based service to enable on-the-fly integration of life science data. This service uses the identifier patterns defined in the Registry to generate a plurality of identifier variants, which can then be used to match source identifiers with target identifiers. We demonstrate the utility of this identifier integration approach by answering queries across major producers of life science Linked Data. Availability: The SPARQL-based identifier conversion service is available without restriction at © The Author(s) 2015. Published by Oxford University Press.
    Bioinformatics 01/2015; DOI:10.1093/bioinformatics/btv064 · 4.62 Impact Factor
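
The last abstract above describes generating identifier variants from registered identifier patterns so that a source identifier can be matched against target identifiers. A minimal Python sketch of that general idea follows. It is an illustration only, not the service's implementation (which is SPARQL-based): the pattern list, the example.org/example.com/example.net URLs, and the function names `extract_local_id` and `iri_variants` are all hypothetical.

```python
# Hypothetical IRI patterns that different providers might use for the same
# kind of record; "{id}" marks where the local identifier goes.
# These URLs are placeholders, not entries from any real registry.
PATTERNS = [
    "http://example.org/uniprot/{id}",
    "http://example.com/protein/{id}",
    "http://purl.example.net/up:{id}",
]


def extract_local_id(iri, patterns):
    """Return the local identifier if the IRI matches a known pattern, else None."""
    for pattern in patterns:
        prefix, suffix = pattern.split("{id}")
        has_room = len(iri) > len(prefix) + len(suffix)
        if has_room and iri.startswith(prefix) and iri.endswith(suffix):
            return iri[len(prefix):len(iri) - len(suffix)]
    return None


def iri_variants(iri, patterns=PATTERNS):
    """Generate every pattern-based variant of an IRI.

    An IRI that matches no known pattern is returned unchanged, as a
    single-element list.
    """
    local_id = extract_local_id(iri, patterns)
    if local_id is None:
        return [iri]
    return [pattern.replace("{id}", local_id) for pattern in patterns]


if __name__ == "__main__":
    for variant in iri_variants("http://example.com/protein/P12345"):
        print(variant)
```

A source identifier could then be matched against a target dataset by checking whether any of its generated variants appears among the target's identifiers.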