Bioinformatics (BIOINFORMATICS)
Description
The journal aims to publish high-quality, peer-reviewed original scientific papers and excellent review articles in the fields of computational molecular biology, biological databases and genome bioinformatics.
- Impact factor: 5.47
- Website: Bioinformatics website
- Other titles: Bioinformatics (Oxford, England: Online)
- ISSN: 1367-4803
- OCLC: 39184474
- Material type: Document, Periodical, Internet resource
- Document type: Internet Resource, Computer File, Journal / Magazine / Newspaper
Publisher details
Pre-print
- Author can archive a pre-print version
Post-print
- Author cannot archive a post-print version
Restrictions
- 12 month embargo on science, technology, medicine articles
- 24 month embargo on arts and humanities articles
- Some titles may have different embargoes
Conditions
- Pre-print can only be posted prior to acceptance
- Pre-print must be accompanied by set statement (see link)
- Pre-print must not be replaced with post-print, instead a link to published version with amended set statement should be made
- Pre-print on personal website, employer website, free public server or pre-prints in subject area
- Post-print on Institutional or Central repositories
- Publisher version cannot be used except for Nucleic Acids Research articles
- Published source must be acknowledged
- Must link to publisher version
- Set phrase to accompany archived copy (see policy)
- Articles in some journals can be made Open Access on payment of additional charge
- Eligible UK authors may deposit in OpenDepot
- Publisher will deposit on behalf of NIH funded authors to PubMed Central, Nucleic Acids Research authors must pay their fee first
- Some titles may use different policies
Classification: yellow
Publications in this journal
Article: imDEV: a Graphical User Interface to R Multivariate Analysis Tools in Microsoft Excel
ABSTRACT: Summary: Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet-embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enable interactive and dynamic analyses of large data by interfacing R’s multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal (PCA) and independent component analyses (ICA), partial least squares regression (PLS) and discriminant analysis (PLS-DA), through an intuitive interface for creating high-quality two- and three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Availability and Implementation: Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010). Contact: John W. Newman – John.Newman@ars.usda.gov Supplementary information: Installation instructions, tutorials and user manual are available at http://sourceforge.net/apps/mediawiki/imdev.
Bioinformatics 07/2012; 28(17):2288-90.
Article: Lin YS, Lin CC, Tsai YS, Ku TC, Huang YH, Hsu CN. A spectral graph theoretic approach to quantification and calibration of collective morphological differences in cell images. Bioinformatics. 2010 Jun 15;26(12):i29-37. doi: 10.1093/bioinformatics/btq194. (First author)
Bioinformatics 06/2012;
Article: GRAST: a new way of genome reduction analysis using comparative genomics.
ABSTRACT: MOTIVATION: Establishment of intra-cellular life involved a profound re-configuration of the genetic characteristics of bacteria, including genome reduction and rearrangements. Understanding the mechanisms underlying these phenomena will shed light on the genome rearrangements essential for the development of an intra-cellular lifestyle. Comparison of genomes with differences in their sizes poses statistical as well as computational problems. Little effort has been made to develop flexible computational tools with which to analyse genome reduction and rearrangements. RESULTS: Investigation of genome reduction and rearrangements in endosymbionts using a novel computational tool (GRAST) identified gathering of genes with similar functions. Conserved clusters of functionally related genes (CGSCs) were detected. Heterogeneous gene and gene cluster non-functionalization/loss were identified between genome regions, between functional gene categories and during evolution. Results show that gene non-functionalization has accelerated during the last 50 MY of Buchnera's evolution while CGSCs have been static.
Bioinformatics 08/2006; 22(13):1551-61.
Article: Integrative Array Analyzer: a software package for analysis of cross-platform and cross-species microarray data.
ABSTRACT: The rapid accumulation of microarray data translates into an urgent need for tools to perform integrative microarray analysis. Integrative Array Analyzer is a comprehensive analysis and visualization software toolkit, which aims to facilitate the reuse of the large amount of cross-platform and cross-species microarray data. It is composed of the data preprocess module, the co-expression analysis module, the differential expression analysis module, the functional and transcriptional annotation module and the graph visualization module.
Bioinformatics 08/2006; 22(13):1665-7.
Article: Map2mod--a server for evaluation of crystallographic models and their agreement with electron density maps.
ABSTRACT: Here we report on recent developments of the map2mod server. It has been designed for validation of protein models created by X-ray data interpretation. It can also be used during the refinement process since it is able to indicate problem regions in the model. Apart from evaluation of model quality, it has an option to remove side-chain atoms that are not consistent with the maps, as well as improperly placed water molecules. There are two additional options: checking the B-factors of atoms in the provided model, and comparing the R and R(free) values obtained as the result of refinement with the averages characteristic for the data resolution shell.
Bioinformatics 08/2006; 22(13):1660-1.
Article: Improving MHC binding peptide prediction by incorporating binding data of auxiliary MHC molecules.
ABSTRACT: MOTIVATION: Various computational methods have been proposed to tackle the problem of predicting the peptide binding ability for a specific MHC molecule. These methods are based on known binding peptide sequences. However, currently available peptide databases contain relatively few examples and are highly redundant. Existing studies show that MHC molecules can be classified into supertypes in terms of peptide-binding specificities. Therefore, we first give a method for reducing the redundancy in a given dataset based on information entropy, then present a novel approach for prediction by learning a predictive model from a dataset of binders for not only the molecule of interest but also for other MHC molecules. RESULTS: We experimented on the HLA-A family with the binding nonamers of A1 supertype (HLA-A*0101, A*2601, A*2902, A*3002), A2 supertype (A*0201, A*0202, A*0203, A*0206, A*6802), A3 supertype (A*0301, A*1101, A*3101, A*3301, A*6801) and A24 supertype (A*2301 and A*2402), whose data were collected from six publicly available peptide databases and two private sources. The results show that our approach significantly improves the prediction accuracy of peptides that bind a specific HLA molecule when we combine binding data of HLA molecules in the same supertype. Our approach can thus be used to help find new binders for MHC molecules.
Bioinformatics 08/2006; 22(13):1648-55.
Article: An online literature mining tool for protein phosphorylation.
ABSTRACT: A web-based version of the RLIMS-P literature mining system was developed for online mining of protein phosphorylation information from MEDLINE abstracts. The online tool presents extracted phosphorylation objects (phosphorylated proteins, phosphorylation sites and protein kinases) in summary tables and full reports with evidence-tagged abstracts. The tool further allows mapping of phosphorylated proteins to protein entries in the UniProt Knowledgebase based on PubMed ID and/or protein name. The literature mining, coupled with database association, allows retrieval of rich biological information for the phosphorylated proteins and facilitates database annotation of phosphorylation features.
Bioinformatics 08/2006; 22(13):1668-9.
Article: Identification of biochemical networks by S-tree based genetic programming.
ABSTRACT: MOTIVATION: Most previous approaches to model biochemical networks have focused either on the characterization of a network structure with a number of components or on the estimation of kinetic parameters of a network with a relatively small number of components. For system-level understanding, however, we should examine both the interactions among the components and the dynamic behaviors of the components. A key obstacle to this simultaneous identification of the structure and parameters is the lack of data compared with the relatively large number of parameters to be estimated. Hence, there are many plausible networks for the given data, but most of them are not likely to exist in the real system. RESULTS: We propose a new representation named S-trees for both the structural and dynamical modeling of a biochemical network within a unified scheme. We further present S-tree based genetic programming to identify the structure of a biochemical network and to estimate the corresponding parameter values at the same time. While other evolutionary algorithms require additional techniques for sparse structure identification, our approach can automatically assemble the sparse primitives of a biochemical network in an efficient way. We evaluate our algorithm on the dynamic profiles of an artificial genetic network. In 20 trials for four settings, we obtain the true structure and their relative squared errors are <5% regardless of releasing constraints about structural sparseness. In addition, we confirm that the proposed algorithm is robust within a ±10% noise ratio. Furthermore, the proposed approach ensures a reasonable estimate of a real yeast fermentation pathway. The comparatively less important connections with non-zero parameters can be detected even though their orders are below 10^-2. To demonstrate the usefulness of the proposed algorithm for real experimental biological data, we provide an additional example on the transcriptional network of SOS response to DNA damage in Escherichia coli. We confirm that the proposed algorithm can successfully identify the true structure except for one relation.
Bioinformatics 08/2006; 22(13):1631-40.
Article: GAME: detecting cis-regulatory elements using a genetic algorithm.
ABSTRACT: MOTIVATION: Identification of transcription factor binding sites is an important aspect of the analysis of genetic regulation. Many programs have been developed for the de novo discovery of a binding motif (collection of binding sites). Recently, a scoring function formulation was derived that allows for the comparison of discovered motifs from different programs [S.T. Jensen, X.S. Liu, Q. Zhou and J.S. Liu (2004) Stat. Sci., 19, 188-204]. A simple program, BioOptimizer, was proposed in [S.T. Jensen and J.S. Liu (2004) Bioinformatics, 20, 1557-1564] that improved discovered motifs by optimizing a scoring function. However, BioOptimizer is a very simple algorithm that can only make local improvements upon an already discovered motif, and so BioOptimizer can only be used in conjunction with other motif-finding software. RESULTS: We introduce software, GAME, which utilizes a genetic algorithm to find optimal motifs in DNA sequences. GAME evolves motifs with high fitness from a population of randomly generated starting motifs, which eliminates the reliance on additional motif-finding programs. In addition to using standard genetic operations, GAME also incorporates two additional operators that are specific to the motif discovery problem. We demonstrate the superior performance of GAME compared with MEME, BioProspector and BioOptimizer in simulation studies as well as several real data applications where we use an extended version of the GAME algorithm that allows the motif width to be unknown.
Bioinformatics 08/2006; 22(13):1577-84.
Article: Intrinsically disordered C-terminal segments of voltage-activated potassium channels: a possible fishing rod-like mechanism for channel binding to scaffold proteins.
ABSTRACT: Membrane-embedded voltage-activated potassium channels (Kv) bind intracellular scaffold proteins, such as the Post Synaptic Density 95 (PSD-95) protein, using a conserved PDZ-binding motif located at the channels' C-terminal tip. This interaction underlies Kv-channel clustering, and is important for the proper assembly and functioning of the synapse. Here we demonstrate that the C-terminal segments of Kv channels adjacent to the PDZ-binding motif are intrinsically disordered. Phylogenetic analysis of the Kv channel family reveals a cluster of channel sequences belonging to three out of the four main channel families, for which an association is demonstrated between the presence of the consensus terminal PDZ-binding motif and the intrinsically disordered nature of the immediately adjacent C-terminal segment. Our observations, combined with a structural analogy to the N-terminal intra-molecular ball-and-chain mechanism for Kv channel inactivation, suggest that the C-terminal disordered segments of these channel families encode an inter-molecular fishing rod-like mechanism for K+ channel binding to scaffold proteins.
Bioinformatics 08/2006; 22(13):1546-50.
Article: Computational recognition of potassium channel sequences.
ABSTRACT: MOTIVATION: Potassium channels are mainly known for their role in regulating and maintaining the membrane potential. Since this is one of the key mechanisms of signal transduction, malfunction of these potassium channels leads to a wide variety of severe diseases. Thus potassium channels are priority targets of research for new drugs, despite the fact that this protein family is highly variable and closely related to other channels, which makes it very difficult to identify new types of potassium channel sequences. RESULTS: Here we present a new method for identifying potassium channel sequences (PSM, Property Signature Method), which, in contrast to the known methods for protein classification, is directly based on physicochemical properties of amino acids rather than on the amino acids themselves. A signature for the pore region including the selectivity filter has been created, representing the most common physicochemical properties of known potassium channels. This string enables genome-wide screening for sequences with similar features despite a very low degree of amino acid similarity within a protein family.
Bioinformatics 08/2006; 22(13):1562-8.
Article: Integrated analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: zero-inflated Poisson regression models to predict abundance of undetected proteins.
ABSTRACT: MOTIVATION: Integrated analysis of global scale transcriptomic and proteomic data can provide important insights into the metabolic mechanisms underlying complex biological systems. However, because the relationship between protein abundance and mRNA expression level is complicated by many cellular and physical processes, sophisticated statistical models need to be developed to capture their relationship. RESULTS: In this study, we describe a novel data-driven statistical model to integrate whole-genome microarray and proteomic data collected from Desulfovibrio vulgaris grown under three different conditions. Based on the Poisson distribution pattern of proteomic data and the fact that a large number of proteins were undetected (excess zeros), zero-inflated Poisson (ZIP)-based models were proposed to define the correlation pattern between mRNA and protein abundance. In addition, by assuming that there is a probability mass at zero representing unexpressed genes and expressed proteins that were undetected owing to technical limitations, a Potential ZIP model was established. Two significant improvements introduced by this approach are (1) the predicted protein abundance level values for experimentally detected proteins are corrected by considering their mRNA levels and (2) protein abundance values can be predicted for undetected proteins (in the case of this study, approximately 83% of the proteins in the D. vulgaris genome) for better biological interpretation. We demonstrated the use of these statistical models by comparatively analyzing proteomic and microarray results from D. vulgaris grown on lactate-based versus formate-based media. These models correctly predicted increased expression of Ech hydrogenase and decreased expression of Coo hydrogenase for D. vulgaris grown on formate.
Bioinformatics 08/2006; 22(13):1641-7.
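The zero-inflated Poisson distribution named in the abstract above can be illustrated with its textbook probability mass function. The sketch below is not the paper's regression model or its "Potential ZIP" variant; the function name `zip_pmf` and the parameters `lam` (Poisson rate) and `pi_zero` (probability of an excess zero) are assumptions introduced for illustration only.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """Textbook zero-inflated Poisson pmf (illustrative, not the paper's model).

    k       -- observed count (non-negative integer)
    lam     -- rate of the Poisson component (hypothetical parameter name)
    pi_zero -- probability that an observation is an excess zero
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either from the excess-zero mass or from the Poisson part.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Example: with 30% excess zeros, a zero count is far more likely than
# under a plain Poisson with the same rate.
p_zero_zip = zip_pmf(0, lam=2.0, pi_zero=0.3)
p_zero_poisson = zip_pmf(0, lam=2.0, pi_zero=0.0)
```

In a regression setting such as the one the abstract describes, `lam` and `pi_zero` would typically depend on covariates (for instance mRNA level), which this sketch leaves out.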
Article: VizStruct for visualization of genome-wide SNP analyses.
ABSTRACT: MOTIVATION: The size, dimensionality and the limited range of the data values make visualization of single nucleotide polymorphism (SNP) datasets challenging. The purpose of this study is to evaluate the usefulness of 3D VizStruct, a novel multi-dimensional data visualization technique for analyzing patterns in SNP datasets. RESULTS: VizStruct is an interactive visualization technique that reduces multi-dimensional data to two dimensions using the complex-valued harmonics of the discrete Fourier transform (DFT). In the 3D VizStruct extension, the multi-dimensional SNP data vectors are reduced to three dimensions using a combination of the DFT and the Kullback-Leibler divergence. The performance of 3D VizStruct was challenged with several biologically relevant published datasets that included human Chromosome 21, the human lipoprotein lipase (LPL) gene locus and the multi-locus genotypes of coral populations. In every case, the 3D VizStruct mapping provided an intuitive visual description of the key characteristics of the underlying multi-dimensional genotype.
Bioinformatics 08/2006; 22(13):1569-76.
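The general idea of reducing a multi-dimensional vector to two dimensions through a complex-valued DFT harmonic can be sketched as follows. This is a minimal illustration, assuming the first harmonic is used and no normalization is applied; the abstract does not give VizStruct's exact formula, and the function name `first_harmonic_2d` is hypothetical.

```python
import cmath


def first_harmonic_2d(vector):
    """Map an N-dimensional vector to a 2D point via its first DFT harmonic.

    The real and imaginary parts of the harmonic serve as x and y coordinates.
    Normalization and harmonic choice are assumptions, not VizStruct's spec.
    """
    n = len(vector)
    harmonic = sum(
        x * cmath.exp(-2j * cmath.pi * k / n) for k, x in enumerate(vector)
    )
    return (harmonic.real, harmonic.imag)


# Example: a unit impulse maps to (1, 0), while a constant vector maps to
# (approximately) the origin because the roots of unity sum to zero.
impulse_point = first_harmonic_2d([1, 0, 0, 0])
constant_point = first_harmonic_2d([2, 2, 2, 2])
```

Each data vector becomes one point in the plane, so vectors with similar profiles land near each other and can be inspected visually.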
Article: Integrating multi-attribute similarity networks for robust representation of the protein space.
ABSTRACT: MOTIVATION: A global view of the protein space is essential for functional and evolutionary analysis of proteins. In order to achieve this, a similarity network can be built using pairwise relationships among proteins. However, existing similarity networks employ a single similarity measure and therefore their utility depends highly on the quality of the selected measure. A more robust representation of the protein space can be realized if multiple sources of information are used. RESULTS: We propose a novel approach for analyzing multi-attribute similarity networks by combining random walks on graphs with Bayesian theory. A multi-attribute network is created by combining sequence and structure based similarity measures. For each attribute of the similarity network, one can compute a measure of affinity from a given protein to every other protein in the network using random walks. This process makes use of the implicit clustering information of the similarity network, and we show that it is superior to naive, local ranking methods. We then combine the computed affinities using a Bayesian framework. In particular, when we train a Bayesian model for automated classification of a novel protein, we achieve high classification accuracy and outperform single attribute networks. In addition, we demonstrate the effectiveness of our technique by comparison with a competing kernel-based information integration approach.
Bioinformatics 08/2006; 22(13):1585-92.
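One common way to compute an affinity from a given node to every other node using random walks is a random walk with restart on the weighted similarity graph. The sketch below shows that generic technique only; the abstract does not specify which random-walk formulation the authors use, and the function name, the `restart` probability and the toy adjacency matrix are all assumptions made for illustration.

```python
def random_walk_affinity(adjacency, start, restart=0.15, iterations=100):
    """Random walk with restart on a weighted graph (generic illustration).

    adjacency -- square list of lists of non-negative edge weights;
                 every node is assumed to have at least one edge
    start     -- index of the query node
    Returns the visiting probability of each node, usable as an affinity score.
    """
    n = len(adjacency)
    # Row-normalize the weights into transition probabilities.
    transition = [[w / sum(row) for w in row] for row in adjacency]
    prob = [0.0] * n
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * prob[i] * transition[i][j]
        nxt[start] += restart  # jump back to the query node
        prob = nxt
    return prob


# Toy similarity network: nodes 0-2 form a tight cluster, node 3 hangs off
# node 2 by a weak edge (weights are made up for the example).
toy_network = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
affinity = random_walk_affinity(toy_network, start=0)
```

Because the walk spends most of its time inside the query node's cluster, cluster members score higher than weakly attached nodes, which is the kind of implicit clustering information the abstract refers to.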
Article: Query Chem: a Google-powered web search combining text and chemical structures.
ABSTRACT: Query Chem (www.QueryChem.com) is a Web program that integrates chemical structure and text-based searching using publicly available chemical databases and Google's Web Application Program Interface (API). Query Chem makes it possible to search the Web for information about chemical structures without knowing their common names or identifiers. Furthermore, a structure can be combined with textual query terms to further restrict searches. Query Chem's search results can retrieve many interesting structure-property relationships of biomolecules on the Web.
Bioinformatics 08/2006; 22(13):1670-3.
Article: STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time.
ABSTRACT: MOTIVATION: Alignment of RNA has a wide range of applications, for example in phylogeny inference, consensus structure prediction and homology searches. Yet aligning structural or non-coding RNAs (ncRNAs) correctly is notoriously difficult as these RNA sequences may evolve by compensatory mutations, which maintain base pairing but destroy sequence homology. Ideally, alignment programs would take RNA structure into account. The Sankoff algorithm for the simultaneous solution of RNA structure prediction and RNA sequence alignment was proposed 20 years ago but suffers from its exponential complexity. A number of programs implement lightweight versions of the Sankoff algorithm by restricting its application to a limited type of structure and/or only pairwise alignment. Thus, despite recent advances, the proper alignment of multiple structural RNA sequences remains a problem. RESULTS: Here we present StrAl, a heuristic method for alignment of ncRNA that reduces sequence-structure alignment to a two-dimensional problem similar to standard multiple sequence alignment. The scoring function takes into account sequence similarity as well as up- and downstream pairing probability. To test the robustness of the algorithm and the performance of the program, we scored alignments produced by StrAl against a large set of published reference alignments. The quality of alignments predicted by StrAl is far better than that obtained by standard sequence alignment programs, especially when sequence homologies drop below approximately 65%; nevertheless StrAl's runtime is comparable to that of ClustalW.
Bioinformatics 08/2006; 22(13):1593-9.
Data provided are for informational purposes only. Although carefully collected, accuracy cannot be guaranteed. The impact factor represents a rough estimation of the journal's impact factor and does not reflect the actual current impact factor. Publisher conditions are provided by RoMEO. Differing provisions from the publisher's actual policy or licence agreement may be applicable.
Related Journals
Scientific Reports
ISSN: 2045-2322
IEEE Transactions on Image Processing
IEEE Signal Processing Society;...
ISSN: 1941-0042, Impact factor: 3.04
IEEE Transactions on Software Engineering
IEEE Computer Society, Institute of...
ISSN: 1939-3539, Impact factor: 1.98
Computers in Biology and Medicine
Elsevier
ISSN: 1879-0534, Impact factor: 1.27
Current Opinion in Biotechnology
Elsevier
ISSN: 1879-0429, Impact factor: 7.82
Current Opinion in Structural Biology
Elsevier
ISSN: 1879-033X, Impact factor: 9.42
Neuropharmacology
Elsevier
ISSN: 1873-7064, Impact factor: 4.81
Magnetic Resonance Imaging
Elsevier
ISSN: 1873-5894, Impact factor: 1.99