Database The Journal of Biological Databases and Curation

Publisher: Oxford Journals (Firm), Oxford University Press (OUP)

Journal description

Current impact factor: 3.37

Impact Factor Rankings

2015 Impact Factor: Available summer 2016
2014 Impact Factor: 3.372
2013 Impact Factor: 4.457
2012 Impact Factor: 4.2
2011 Impact Factor: 2.071

[Chart: impact factor over time]

Additional details

5-year impact: 4.51
Cited half-life: 3.20
Immediacy index: 0.61
Eigenfactor: 0.01
Article influence: 1.68
Other titles: Journal of biological databases and curation
ISSN: 1758-0463
OCLC: 319891682
Material type: Document, Periodical, Internet resource
Document type: Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details

Oxford University Press (OUP)

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author cannot archive a post-print version
  • Conditions
    • Creative Commons Attribution License
    • Pre-print on author's personal website, employer's website or subject repository
    • Pre-print can only be posted prior to acceptance
    • Pre-print must be accompanied by set statement (see link)
    • Pre-print must not be replaced with post-print, instead a link to published version with amended set statement should be made
    • Post-print in Institutional repositories or Central repositories
    • Publisher's version/PDF must be used
    • Publisher's version/PDF on institutional repository or centrally organised repositories
    • Published source must be acknowledged
    • Must link to publisher version
    • Set phrase to accompany archived copy (see policy)
    • Eligible authors may deposit in OpenDepot
    • Publisher will deposit on behalf of NIH, HHMI, UK MRC, Telethon and Wellcome Trust funded authors to PubMed Central and Europe PMC
    • All titles are open access journals
    • Progress of Theoretical and Experimental Physics is a participant in SCOAP3
    • This policy is an exception to the default policies of 'Oxford University Press (OUP)'
  • Classification
    • green

Publications in this journal

  • ABSTRACT: Compared with animal microRNAs (miRNAs), our knowledge of how miRNAs are involved in significant biological processes in plants is still limited. AtmiRNET is a novel resource geared toward plant scientists for reconstructing regulatory networks of Arabidopsis miRNAs. By means of highlighted miRNA studies in target recognition, functional enrichment of target genes, promoter identification and detection of cis- and trans-elements, AtmiRNET allows users to explore mechanisms of transcriptional regulation and miRNA functions in Arabidopsis thaliana, which have rarely been investigated so far. High-throughput next-generation sequencing datasets from transcriptional start site (TSS)-relevant experiments as well as five core promoter elements were collected to establish the support vector machine-based prediction model for Arabidopsis miRNA TSSs. Then, high-confidence transcription factors that participate in the transcriptional regulation of Arabidopsis miRNAs are provided, based on a statistical approach. Furthermore, both experimentally verified and putative miRNA-target interactions, whose validity was supported by the correlations between the expression levels of miRNAs and their targets, are elucidated for functional enrichment analysis. The inferred regulatory networks give users an intuitive insight into the pivotal roles of Arabidopsis miRNAs through the crosstalk between miRNA transcriptional regulation (upstream) and miRNA-mediated (downstream) gene circuits. The visually oriented information in AtmiRNET supplements the scant understanding of plant miRNAs and will be useful (e.g. ABA-miR167c-auxin signaling pathway) for further research. Database URL: © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav042
  • ABSTRACT: Spermatogenic failure is a major cause of male infertility, which affects millions of couples worldwide. The recent discovery of long non-coding RNAs (lncRNAs) as critical regulators in normal and disease development provides new clues for delineating the molecular regulation in male germ cell development. However, few functional lncRNAs have been characterized to date. A major limitation in studying lncRNAs in male germ cell development is the absence of germ cell-specific lncRNA annotation. Current lncRNA annotations are assembled from transcriptome data from heterogeneous tissue sources; specific germ cell transcript information of various developmental stages is therefore under-represented, which may lead to biased prediction or failure to identify important germ cell-specific lncRNAs. GermlncRNA provides the first comprehensive web-based and open-access lncRNA catalogue for three key male germ cell stages, including type A spermatogonia, pachytene spermatocytes and round spermatids. This information has been developed by integrating male germ transcriptome resources derived from RNA-Seq, tiling microarray and GermSAGE. Characterizations of lncRNA-associated regulatory features, potential coding gene and microRNA targets are also provided. Search results from GermlncRNA can be exported to Galaxy for downstream analysis or downloaded locally. Taken together, GermlncRNA offers a new avenue to better understand the role of lncRNAs and associated targets during spermatogenesis. Database URL:
    Database The Journal of Biological Databases and Curation 01/2015; 2015:bav044-bav044. DOI:10.1093/database/bav044
  • ABSTRACT: BioXpress is a gene expression and cancer association database in which the expression levels are mapped to genes using RNA-seq data obtained from The Cancer Genome Atlas, International Cancer Genome Consortium, Expression Atlas and publications. The BioXpress database includes expression data from 64 cancer types, 6361 patients and 17 469 genes with 9513 of the genes displaying differential expression between tumor and normal samples. In addition to data directly retrieved from RNA-seq data repositories, manual biocuration of publications supplements the available cancer association annotations in the database. All cancer types are mapped to Disease Ontology terms to facilitate a uniform pan-cancer analysis. The BioXpress database is easily searched using HUGO Gene Nomenclature Committee gene symbol, UniProtKB/RefSeq accession or, alternatively, can be queried by cancer type with specified significance filters. This interface along with availability of pre-computed downloadable files containing differentially expressed genes in multiple cancers enables straightforward retrieval and display of a broad set of cancer-related genes. Database URL: © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav019
  • ABSTRACT: Functional annotation of genetic variants including single nucleotide polymorphisms (SNPs) and copy number variations (CNV) promises to greatly improve our understanding of human complex traits. Previous transcriptomic studies involving individuals from different global populations have investigated the genetic architecture of gene expression variation by mapping expression quantitative trait loci (eQTL). Functional interpretation of genome-wide association studies (GWAS) has identified enrichment of eQTL in top signals from GWAS of human complex traits. The SCAN (SNP and CNV Annotation) database was developed as a web-based resource of genetical genomic studies including eQTL detected in the HapMap lymphoblastoid cell line samples derived from apparently healthy individuals of European and African ancestry. Considering the critical roles of epigenetic gene regulation, cytosine modification quantitative trait loci (mQTL) are expected to add a crucial layer of annotation to existing functional genomic information. Here, we describe the new features of the SCAN database that integrate comprehensive mQTL mapping results generated in the HapMap CEU (Caucasian residents from Utah, USA) and YRI (Yoruba people from Ibadan, Nigeria) LCL samples and demonstrate the utility of the enhanced functional annotation system. Database URL: © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav025
  • ABSTRACT: Bio-ontologies provide terminologies for the scientific community to describe biomedical entities in a standardized manner. There are multiple initiatives that are developing biomedical terminologies for the purpose of providing better annotation, data integration and mining capabilities. Terminology resources devised for multiple purposes inherently diverge in content and structure. A major issue of biomedical data integration is the development of overlapping terms, ambiguous classifications and inconsistencies represented across databases and publications. The Disease Ontology (DO) was developed over the past decade to address data integration, standardization and annotation issues for human disease data. We have established a DO cancer project to be a focused view of cancer terms within the DO. The DO cancer project mapped 386 cancer terms from the Catalogue of Somatic Mutations in Cancer (COSMIC), The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium, Therapeutically Applicable Research to Generate Effective Treatments, Integrative Oncogenomics and the Early Detection Research Network into a cohesive set of 187 DO terms represented by 63 top-level DO cancer terms. For example, the COSMIC term 'kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma' and TCGA term 'Kidney renal clear cell carcinoma' were both grouped to the term 'Disease Ontology Identification (DOID):4467 / renal clear cell carcinoma' which was mapped to the TopNodes_DOcancerslim term 'DOID:263 / kidney cancer'. Mapping of diverse cancer terms to DO and the use of top-level terms (DO slims) will enable pan-cancer analysis across datasets generated from any of the cancer term sources, where pan-cancer means including or relating to all or multiple types of cancer. The terms can be browsed from the DO web site and downloaded from the DO's Apache Subversion or GitHub repositories. Database URL:
    Database The Journal of Biological Databases and Curation 01/2015; 2015:bav032-bav032. DOI:10.1093/database/bav032
  • ABSTRACT: Tandem duplication is a widespread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions will lead to the expansion of gene families and bring an increase in gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically integrate and present the large amounts of publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily at any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available. Database URL:
    Database The Journal of Biological Databases and Curation 01/2015; 2015:bav017-bav017. DOI:10.1093/database/bav017
  • ABSTRACT: The interaction network surrounding telomeres has been intensively studied during the past two decades. However, no specific resource integrating telomere interaction data is currently available. To facilitate the understanding of the molecular interaction network by which telomeres are associated with biological processes and diseases, we have developed the TeloPIN (Telomeric Proteins Interaction Network) database, a novel database that aims to provide comprehensive information on protein-protein, protein-DNA and protein-RNA interactions of telomeres. The TeloPIN database contains four types of interaction data, including (i) protein-protein interaction (PPI) data, (ii) telomeric protein ChIP-seq data, (iii) telomere-associated protein data and (iv) telomeric repeat-containing RNA (TERRA)-interacting protein data. By analyzing these four types of interaction data, we found that 358 and 199 proteins have more than one type of interaction information in human and mouse cells, respectively. We also developed a table browser and the TeloChIP genome browser to help researchers with better integrated visualization of interaction data from different studies. The current release of the TeloPIN database includes 1111 PPI, eight telomeric protein ChIP-seq data sets, 1391 telomere-associated proteins and 183 TERRA-interacting proteins from 92 independent studies in mammalian cells. The interaction information provided by the TeloPIN database will greatly expand our knowledge of the telomeric protein interaction network. Database URL: The TeloPIN database is freely available for non-commercial use. © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav018
  • ABSTRACT: Neuropeptides play a variety of roles in many physiological processes and serve as potential therapeutic targets for the treatment of some nervous-system disorders. In recent years, there has been a tremendous increase in the number of identified neuropeptides. Therefore, we have developed NeuroPep, a comprehensive resource of neuropeptides, which holds 5949 non-redundant neuropeptide entries originating from 493 organisms belonging to 65 neuropeptide families. In NeuroPep, the number of neuropeptides in invertebrates and vertebrates is 3455 and 2406, respectively. It is currently the most complete neuropeptide database. We extracted entries deposited in UniProt, the database ( and NeuroPedia, and used text mining methods to retrieve entries from the MEDLINE abstracts and full text articles. All the entries in NeuroPep have been manually checked. 2069 of the 5949 (35%) neuropeptide sequences were collected from the scientific literature. Moreover, NeuroPep contains detailed annotations for each entry, including source organisms, tissue specificity, families, names, post-translational modifications, 3D structures (if available) and literature references. Information derived from these peptide sequences, such as amino acid compositions, isoelectric points, molecular weight and other physicochemical properties of peptides, is also provided. A quick search feature allows users to search the database with keywords such as sequence, name, family, etc., and an advanced search page helps users to combine queries with logical operators like AND/OR. In addition, user-friendly web tools like browsing, sequence alignment and mapping are also integrated into the NeuroPep database. Database URL: © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav038
  • ABSTRACT: Identification of novel drug targets is a critical step in drug development. Many recent studies have produced multiple types of data, which provide an opportunity to mine the relationships among them to predict drug targets. In this study, we present a novel integrative approach that combines ontology reasoning with network-assisted gene ranking to predict new drug targets. We utilized colorectal cancer (CRC) as a proof-of-concept use case to illustrate the approach. Starting from FDA-approved CRC drugs and the relationships among disease, drug, gene, pathway, and SNP in an ontology representing PharmGKB data, we inferred 113 potential CRC drug targets. We further prioritized these genes based on their relationships with CRC disease genes in the context of human protein-protein interaction networks. Thus, among the 113 potential drug targets, 15 were selected as promising drug targets, including some genes that are supported by previous studies. Among them, EGFR, TOP1 and VEGFA are known targets of FDA-approved drugs. Additionally, CCND1 (cyclin D1) and PTGS2 (prostaglandin-endoperoxide synthase 2) have been reported to be relevant to CRC or to be potential drug targets, based on a literature search. These results indicate that our approach is promising for drug target prediction for CRC treatment, and might be useful for other cancer therapeutics. © The Author 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav015