RNAcentral: A vision for an international database of RNA sequences

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, United Kingdom.
RNA (Impact Factor: 4.94). 10/2011; 17(11):1941-1946. DOI: 10.1261/rna.2750811
Source: PubMed


During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There has also been substantial growth in the experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database comparable to UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor.

    • "As the GENCODE project is expanding to mouse to improve its reference annotation, the number of lncRNAs and pseudogenes annotated will increase within VEGA. We have also begun to submit the human lncRNAs annotated within VEGA to the Third Party Annotation database (30) to enable submission to the newly formed federated database RNAcentral (31). This will allow more users to access this highly curated data and allow for it to be integrated into a more comprehensive RNA database. "
    ABSTRACT: The Vertebrate Genome Annotation (VEGA) database, initially designed as a community resource for browsing manual annotation of the human genome project, now contains five reference genomes (human, mouse, zebrafish, pig and rat). Its introduction pages have been redesigned to enable the user to easily navigate between whole genomes and smaller multi-species haplotypic regions of interest such as the major histocompatibility complex. The VEGA browser is unique in that annotation is updated via the Human And Vertebrate Analysis aNd Annotation (HAVANA) update track every 2 weeks, allowing single gene updates to be made publicly available to the research community quickly. The user can now access different haplotypic subregions more easily, such as those from the non-obese diabetic mouse, and display them in a more intuitive way using the comparative tools. We also highlight how the user can browse manually annotated updated patches from the Genome Reference Consortium (GRC).
    Nucleic Acids Research 12/2013; 42(Database issue). DOI:10.1093/nar/gkt1241 · 9.11 Impact Factor
    • "In each case, web services enable key information to be downloaded from an appropriate reference database and stored internally within IntAct, enabling curators to annotate interaction data relevant to those entities. In the near future, the increasing amount of RNA-based interaction data will also present new challenges to the molecular interaction curation community and the development of reference resources such as RNAcentral (24) will be critical to the capture of these important interactomes. "
    ABSTRACT: IntAct is a freely available, open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium.
    Nucleic Acids Research 11/2013; 42(Database issue). DOI:10.1093/nar/gkt1115 · 9.11 Impact Factor
    • "In the future, we plan to integrate the databases on different aspects of RNA metabolism using a common data model and a joint interface. Another envisaged direction of development is to link generic RNA representations in RNApathwaysDB to particular mature RNA sequences that will be stored in the future RNAcentral database (31) and to prokaryotic and eukaryotic genome sequence databases, so e.g. cleavage sites in maturation reactions could be mapped onto the precursor RNA molecules. "
    ABSTRACT: Many RNA molecules undergo complex maturation, involving e.g. excision from primary transcripts, removal of introns, post-transcriptional modification and polyadenylation. The level of mature, functional RNAs in the cell is controlled not only by the synthesis and maturation but also by degradation, which proceeds via many different routes. The systematization of data about RNA metabolic pathways and enzymes taking part in RNA maturation and degradation is essential for the full understanding of these processes. RNApathwaysDB is an online resource about maturation and decay pathways involving RNA as the substrate. The current release presents information about reactions and enzymes that take part in the maturation and degradation of tRNA, rRNA and mRNA, and describes pathways in three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. RNApathwaysDB can be queried with keywords, and sequences of protein enzymes involved in RNA processing can be searched with BLAST. Options for data presentation include pathway graphs and tables with enzymes and literature data. Structures of macromolecular complexes involving RNA and proteins that act on it are presented as 'potato models' using DrawBioPath, a new JavaScript tool.
    Nucleic Acids Research 11/2012; 41(Database issue). DOI:10.1093/nar/gks1052 · 9.11 Impact Factor