Bioinformatics (BIOINFORMATICS)

Publisher: Oxford University Press (OUP)

Journal description

The journal aims to publish high-quality, peer-reviewed original scientific papers and excellent review articles in the fields of computational molecular biology, biological databases, and genome bioinformatics.

Current impact factor: 4.98

Impact Factor Rankings

2016 Impact Factor Available summer 2017
2014 / 2015 Impact Factor 4.981
2013 Impact Factor 4.621
2012 Impact Factor 5.323
2011 Impact Factor 5.468
2010 Impact Factor 4.877
2009 Impact Factor 4.926
2008 Impact Factor 4.328
2007 Impact Factor 5.039
2006 Impact Factor 4.894
2005 Impact Factor 6.019
2004 Impact Factor 5.742
2003 Impact Factor 6.701
2002 Impact Factor 4.615
2001 Impact Factor 3.421
2000 Impact Factor 3.409
1999 Impact Factor 2.259

Impact factor over time

[Chart not reproduced; axes: impact factor, year]

Additional details

5-year impact 8.14
Cited half-life 6.90
Immediacy index 1.17
Eigenfactor 0.20
Article influence 3.57
Website Bioinformatics website
Other titles Bioinformatics (Oxford, England: Online)
ISSN 1367-4811
OCLC 39184474
Material type Document, Periodical, Internet resource
Document type Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details

Oxford University Press (OUP)

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author cannot archive a post-print version
  • Restrictions
    • 12-month embargo
  • Conditions
    • Pre-print can only be posted prior to acceptance
    • Pre-print must be accompanied by set statement (see link)
    • Pre-print must not be replaced with post-print, instead a link to published version with amended set statement should be made
    • Pre-print on author's personal website, employer website, free public server or pre-prints in subject area
    • Post-print in Institutional repositories or Central repositories
    • Publisher's version/PDF cannot be used
    • Published source must be acknowledged
    • Must link to publisher version
    • Set phrase to accompany archived copy (see policy)
    • Eligible authors may deposit in OpenDepot
    • The publisher will deposit in PubMed Central on behalf of NIH authors
    • Publisher last contacted on 19/02/2015
    • This policy is an exception to the default policies of 'Oxford University Press (OUP)'
  • Classification
    • yellow

Publications in this journal

  • ABSTRACT: Motivation: Alignment-based taxonomic binning for metagenome characterization proceeds in two steps: read mapping against a reference database (RDB) and taxonomic assignment according to the best hits. Beyond the sequencing technology and the completeness of the RDB, selecting the optimal configuration of the workflow, in particular the mapper parameters and the best hit selection threshold, to get the highest binning performance remains quite empirical. Results: We developed a statistical framework to perform such optimization at a minimal computational cost. Using an optimization experimental design and simulated datasets for three sequencing technologies, we built accurate prediction models for five performance indicators and then derived the parameter configuration providing the optimal performance. Regardless of the mapper and the dataset, we observed that the optimal configuration yielded better performance than the default configuration and that the best hit selection threshold had a large impact on performance. Finally, on a reference dataset from the Human Microbiome Project, we confirmed that the optimized configuration increased the performance compared to the default configuration. Availability and implementation: Not applicable. Contact: magali.dancette@biomerieux.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Motivation: Transposon insertion sequencing (Tn-seq) is an emerging technology that combines transposon mutagenesis with next-generation sequencing technologies for the identification of genes related to bacterial survival. The resulting data from Tn-seq experiments consist of sequence reads mapped to millions of potential transposon insertion sites, and a large portion of insertion sites have zero mapped reads. A novel statistical method for Tn-seq data analysis is needed to infer the functions of genes in bacterial growth. Results: In this paper we propose a zero-inflated Poisson model for analyzing Tn-seq data, which are high-dimensional and have an excess of zeros. Maximum likelihood estimates of model parameters are obtained using an expectation-maximization (EM) algorithm, and pseudogenes are utilized to construct appropriate statistical tests for the transposon insertion tolerance of normal genes of interest. We propose a multiple testing procedure that categorizes genes into one of three states, hypo-tolerant, tolerant, and hyper-tolerant, while controlling the false discovery rate. We evaluate the proposed method with simulation studies and apply it to real Tn-seq data from an experiment that studied the bacterial pathogen Campylobacter jejuni. Availability: The proposed method is implemented in R and the script is available at http://github.com/ffliu/TnSeq. Contact: pliu@iastate.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at the journal's web site.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Motivation: Advances in next-generation sequencing technologies and the availability of short read data enable the detection of structural variations (SVs). Deletions, an important type of SV, have been suggested to be associated with genetic diseases. There are three types of deletions: blunt deletions, deletions with microhomologies and deletions with microinsertions. The last two types are very common in the human genome, but they are difficult to detect. Furthermore, finding deletions from sequencing data remains challenging. It is highly appealing to develop sensitive and accurate methods to detect deletions from sequencing data, especially deletions with microhomologies and deletions with microinsertions. Results: We present a novel method called Sprites which finds deletions from sequencing data. It aligns a whole soft-clipping read rather than its clipped part to the target sequence, a segment of the reference which is determined by spanning reads, in order to find the longest prefix or suffix of the read that has a match in the target sequence. This alignment aims to solve the problem of deletions with microhomologies and deletions with microinsertions. Using both simulated and real data we show that Sprites performs better on detecting deletions compared to other current methods in terms of F-score. Availability: Sprites is open source software and freely available at https://github.com/zhangzhen/sprites. Contact: jxwang@mail.csu.edu.cn.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Availability and implementation: The Geospatial Data Quality API is part of the VertNet set of APIs. It can be accessed at http://api-geospatial.vertnet-portal.appspot.com/geospatial and is already implemented in the VertNet data portal for quality reporting. Source code is freely available under GPL license from http://www.github.com/vertnet/api-geospatial. Contact: javier.otegui@gmail.com or rguralnick@flmnh.ufl.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Availability: The package GiANT is available on CRAN. Contact: hans.kestler@leibniz-fli.de or hans.kestler@uni-ulm.de.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Motivation: Simple forms of mutualism between microorganisms are widespread in nature. Nevertheless, the role played by the environmental nutrient composition in mediating cross-feeding in microbial ecosystems is still poorly understood. Results: Here, we use mixed-integer bilevel linear programming to investigate the cost of sharing metabolic resources in microbial communities. The algorithm infers an optimal combination of nutrients that can selectively sustain synergistic growth for a pair of species and guarantees minimum cost of cross-fed metabolites. To test model-based predictions, we selected a pair of Escherichia coli single-gene knockouts auxotrophic for arginine and leucine, respectively (ΔargB and ΔleuB), and we experimentally verified that the model-predicted medium composition significantly favors mutualism. Moreover, mass spectrometry profiling of exchanged metabolites confirmed the predicted cross-fed metabolites, supporting our constraint-based modeling approach as a promising tool for engineering microbial consortia. Availability: The software is freely available as a MATLAB script in the supplementary materials. Contact: zampieri@imsb.biol.ethz.ch, sauer@imsb.biol.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Motivation: The underlying relationship between genomic factors and the response to diverse cancer drugs remains unclear. A number of studies showed that patients' heterogeneous responses to anticancer treatments were partly associated with their specific changes in gene expression and somatic alterations. The emerging large-scale pharmacogenomic data provide valuable opportunities to improve existing therapies or to guide early-phase clinical trials of compounds under development. However, how to identify the underlying combinatorial patterns among pharmacogenomics data is still a challenging issue. Results: In this study, we adopted a sparse network-regularized partial least square (SNPLS) method to identify joint modular patterns using large-scale pairwise gene-expression and drug-response data. We incorporated a molecular network into the (sparse) partial least square model to improve the module accuracy via a network-based penalty. We first demonstrated the effectiveness of SNPLS using a set of simulation data and compared it with two typical methods. Further, we applied it to gene expression profiles for 13321 genes and pharmacological profiles for 98 anticancer drugs across 641 cancer cell lines consisting of diverse types of human cancers. We identified 20 gene-drug co-modules, each of which consists of 30 cell lines, 137 genes and 2 drugs on average. The majority of identified co-modules have significant functional implications and coordinated gene-drug associations. The modular analysis here provided new insights into the molecular mechanisms of how drugs act and suggested new drug targets for therapy of certain types of cancers. Availability: A MATLAB package of SNPLS is available at http://page.amss.ac.cn/shihua.zhang/ Contact: zsh@amss.ac.cn.
    Article · Feb 2016 · Bioinformatics
  • ABSTRACT: Availability: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools. Contact: vn2@sanger.ac.uk, pd3@sanger.ac.uk SUPPLEMENTAL INFORMATION: Online-only supplementary data is available at the journal's web site.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Motivation: We address a common problem in large-scale data analysis, and especially the field of genetics, the huge-scale testing problem, where millions to billions of hypotheses are tested together creating a computational challenge to control the inflation of the false discovery rate. As a solution we propose an alternative algorithm for the famous Linear Step Up procedure of Benjamini and Hochberg (1995). Results: Our algorithm requires linear time and does not require any p-value ordering. It permits separating huge-scale testing problems arbitrarily into computationally feasible sets or chunks. Results from the chunks are combined by our algorithm to produce the same results as the controlling procedure on the entire set of tests, thus controlling the global false discovery rate even when p-values are arbitrarily divided. The practical memory usage may also be determined arbitrarily by the size of available memory. Availability and implementation: R code is provided in the supplementary material. Contact: sbatista@cs.princeton.edu.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Motivation: Developments in biotechnology have enabled the in vitro evolution of binding proteins. The emerging limitations of antibodies in binding protein engineering have led to suggestions for other proteins as alternative binding protein scaffolds. Most of these proteins were selected based on human intuition rather than systematic analysis of the available data. To improve this strategy, we developed a computational framework for finding desirable binding protein scaffolds by utilizing protein structure and sequence information. Results: For each protein, its structure and the sequences of evolutionarily related proteins were analyzed, and spatially contiguous regions composed of highly variable residues were identified. A large number of proteins have these regions, but leucine rich repeats (LRRs), histidine kinase domains, and immunoglobulin domains are predominant among them. The candidates suggested as new binding protein scaffolds include histidine kinase, LRR, titin, and pentapeptide repeat protein. Availability: The database and web-service are accessible via http://bcbl.kaist.ac.kr/LibBP. Contact: kds@kaist.ac.kr.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Availability and implementation: FILTUS is written in Python and runs on Windows, Mac and Linux. Binaries and source code are freely available at http://folk.uio.no/magnusv/filtus.html and on GitHub: https://github.com/magnusdv/filtus. Automatic installation is available via PyPI (e.g. pip install filtus). Contact: magnusdv@medisin.uio.no SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Availability and implementation: The source is freely available under a GPL license on GitHub, along with user documentation and pre-compiled binaries and instructions for several platforms: https://github.com/Tarostar/QMLGalaxyPortal. It is available for iOS version 7 (and newer) through the Apple App Store, and for Android through Google Play for version 4.1 (API 16) or newer. Contact: geirksa@ifi.uio.no.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Unipept is an open source web application that is designed for metaproteomics analysis with a focus on interactive data visualization. It is underpinned by a fast index built from UniProtKB and the NCBI taxonomy that enables quick retrieval of all UniProt entries in which a given tryptic peptide occurs. Unipept version 2.4 introduced web services that provide programmatic access to the metaproteomics analysis features. This enables integration of Unipept functionality in custom applications and data processing pipelines. Availability and Implementation: The web services are freely available at http://api.unipept.ugent.be and are open sourced under the MIT license.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Motivation: DNA methylation is an epigenetic modification with important roles in many biological processes and diseases. Bisulfite sequencing (BS-seq) has emerged recently as the technology of choice to profile DNA methylation because of its accuracy, genome coverage, and higher resolution. Current statistical methods to identify differential methylation mainly focus on comparing two treatment groups. With an increasing number of experiments performed under a general and multiple-factor design, particularly in reduced representation bisulfite sequencing, there is a need to develop more flexible, powerful and computationally efficient methods. Results: We present a novel statistical model to detect differentially methylated loci from BS-seq data under general experimental design, based on a beta-binomial regression model with "arcsine" link function. Parameter estimation is based on transformed data with a generalized least squares approach, without relying on an iterative algorithm. Simulation and real data analyses demonstrate that our method is accurate, powerful, robust and computationally efficient. Availability: It is available as Bioconductor package DSS. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: yongpark@pitt.edu, hao.wu@emory.edu.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Availability and implementation: BioCircos.js and its manual are freely available online at http://bioinfo.ibp.ac.cn/biocircos/. Contact: rschen@ibp.ac.cn.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Motivation: The goal of deciphering the human glycome has been hindered by the lack of high-throughput sequencing methods for glycans. While mass spectrometry (MS) is a key technology in glycan sequencing, MS alone provides limited information about the identification of monosaccharide constituents, their anomericity and their linkages. These features of individual, purified glycans can be partly identified using well-defined glycan-binding proteins, such as lectins and antibodies that recognize specific determinants within glycan structures. Results: We present a novel computational approach to automate the sequencing of glycans using metadata-assisted glycan sequencing (MAGS), which combines MS analyses with glycan structural information from glycan microarray technology. Success in this approach was aided by the generation of a "virtual glycome" to represent all potential glycan structures that might exist within a metaglycome based on a set of biosynthetic assumptions using known structural information. We exploited this approach to deduce the structures of soluble glycans within the human milk glycome by matching predicted structures based on experimental data against the virtual glycome. This represents the first metaglycome to be defined using this method and we provide a publicly available web-based application to aid in sequencing milk glycans. Availability: http://glycomeseq.emory.edu Contact: sagravat@bidmc.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Motivation: Structured sparse canonical correlation analysis (SCCA) models have been used to identify imaging genetic associations. These models either use group lasso or graph-guided fused lasso to conduct feature selection and feature grouping simultaneously. The group lasso based methods require prior knowledge to define the groups, which limits the capability when prior knowledge is incomplete or unavailable. The graph-guided methods overcome this drawback by using the sample correlation to define the constraint. However, they are sensitive to the sign of the sample correlation, which could introduce undesirable bias if the sign is wrongly estimated. Results: We introduce a novel SCCA with a new penalty, and develop an efficient optimization algorithm. Our method has a strong upper bound for the grouping effect for both positively and negatively correlated features. We show that our method performs as well as or better than three competing SCCA models on both synthetic and real data. In particular, our method identifies stronger canonical correlations and better canonical loading patterns, showing its promise for revealing interesting imaging genetic associations. Availability: The Matlab code and sample data are freely available at http://www.iu.edu/shenlab/tools/angscca/. Contact: Li Shen: shenli@iu.edu.
    Article · Jan 2016 · Bioinformatics
  • ABSTRACT: Availability: The FamAgg package is freely available at the Bioconductor repository, http://www.bioconductor.org/packages/FamAgg. Contact: Christian.Weichenberger@eurac.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Article · Jan 2016 · Bioinformatics
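
One of the abstracts listed above proposes an alternative algorithm for the Linear Step Up procedure of Benjamini and Hochberg (1995). For readers unfamiliar with that procedure, the following is a minimal sketch of the standard, sort-based version only. It is not the linear-time, chunked algorithm the abstract describes, whose details are not given here. The function name `bh_reject`, the default level `q = 0.05`, and the example p-values are illustrative assumptions.

```python
from typing import List, Sequence


def bh_reject(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Standard Benjamini-Hochberg step-up procedure (sort-based sketch).

    Returns one flag per input p-value: True if that hypothesis is rejected
    at false discovery rate level q. Hypothetical helper, for illustration.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest 1-based rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank

    # Reject the k_max smallest p-values, including any below rank k_max
    # that individually missed their own threshold.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, not data from any of the listed articles.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(bh_reject(pvals, q=0.05))
```

The sort makes this sketch O(m log m) and requires all p-values in memory at once, which is the cost the abstract says its algorithm avoids.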