Journal of Biomedical Informatics (J Biomed Informat)

Publisher: Elsevier

Journal description

The Journal of Biomedical Informatics (formerly Computers and Biomedical Research) has been redesigned to reflect a commitment to high-quality original research papers and reviews in the area of biomedical informatics. Although published articles are motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, imaging, and bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices and formal evaluations of completed systems, including clinical trials of information technologies, would generally be more suitable for publication in other venues. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report.

Current impact factor: 2.19

Impact Factor Rankings

2015 Impact Factor Available summer 2016
2014 Impact Factor 2.194
2013 Impact Factor 2.482
2012 Impact Factor 2.131
2011 Impact Factor 1.792
2010 Impact Factor 1.719
2009 Impact Factor 2.432
2008 Impact Factor 1.924
2007 Impact Factor 2.000
2006 Impact Factor 2.346
2005 Impact Factor 2.388
2004 Impact Factor 1.013
2003 Impact Factor 0.855
2002 Impact Factor 0.862

Additional details

5-year impact 3.44
Cited half-life 5.20
Immediacy index 0.61
Eigenfactor 0.01
Article influence 1.14
Website Journal of Biomedical Informatics website
Other titles Journal of biomedical informatics (Online)
ISSN 1532-0480
OCLC 45147742
Material type Document, Periodical, Internet resource
Document type Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details


  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on any website, including arXiv and RePEc
    • Author's post-print on author's personal website immediately
    • Author's post-print on open access repository after an embargo period of between 12 months and 48 months
    • Deposits mandated by funding body, institutional, or governmental policy may be required to comply with embargo periods of 12 to 48 months
    • Author's post-print may be used to update arXiv and RePEc
    • Publisher's version/PDF cannot be used
    • Must link to publisher version with DOI
    • Author's post-print must be released with a Creative Commons Attribution Non-Commercial No Derivatives License
    • Publisher last reviewed on 03/06/2015
  • Classification
    • green

Publications in this journal

  • ABSTRACT: Advances in the medical field have increased the need to incorporate modern techniques into surgical resident training and surgical skills learning. To facilitate this integration, one approach that has gained credibility is the incorporation of simulator-based training to supplement traditional training programs. However, existing implementations of these training methods still require the constant presence of a competent surgeon to assess the surgical dexterity of the trainee, which limits the evaluation methods and relies on subjective evaluation. This research proposes an efficient, effective, and economical video-based skill assessment technique for minimally invasive surgery (MIS). It analyzes a surgeon's hand and surgical tool movements and detects features such as smoothness, efficiency, and precision. The system is capable of providing both real-time on-screen feedback and a performance score at the end of the surgery. Finally, we present a web-based tool where surgeons can securely upload MIS training videos and receive evaluation scores and an analysis of trainees' performance trends over time.
    Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.11.002 (see sketch 1 after this list)
  • ABSTRACT: Self-reported patient data have been shown to be a valuable knowledge source for post-market pharmacovigilance. In this paper we propose using the popular micro-blogging service Twitter to gather evidence about adverse drug reactions (ADRs), after first identifying micro-blog messages (also known as "tweets") that report first-hand experience. To achieve this goal we explore machine learning with data crowdsourced from lay annotators. With the help of lay annotators recruited from CrowdFlower, we manually annotated 1548 tweets containing keywords related to two kinds of drugs: SSRIs (e.g., paroxetine) and cognitive enhancers (e.g., Ritalin). Our results show that inter-annotator agreement (Fleiss' kappa) for crowdsourcing is in moderate agreement with a pair of experienced annotators (Spearman's rho = 0.471). We utilized the gold-standard annotations from CrowdFlower to automatically train a range of supervised machine learning models to recognize first-hand experience. F-scores are reported for 6 of these techniques, with the Bayesian Generalized Linear Model performing best (F-score = 0.64 and Informedness = 0.43) when combined with a set of features selected using the information gain criterion.
    Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.11.004 (see sketch 2 after this list)
  • ABSTRACT: Gene selection from high-dimensional microarray gene-expression data is statistically a challenging problem. Filter approaches to gene selection have been popular because of their simplicity, efficiency, and accuracy. Due to small sample sizes, all samples are generally used to compute the relevant ranking statistics, and the selection of samples in filter-based gene selection methods has not been addressed. In this paper, we extend a previously proposed simultaneous sample and gene selection approach. In a backward elimination method, a modified logistic regression loss function is used to select relevant samples at each iteration, and these samples are used to compute the T-score to rank genes. This method provides a compromise solution between the T-score and other support vector machine (SVM) based algorithms. The performance is demonstrated on both simulated and real datasets with criteria such as classification performance, stability, and redundancy. Results indicate that the computational complexity and stability of the method are improved compared to SVM-based methods without compromising classification performance.
    Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.11.003 (see sketch 3 after this list)

  • Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.11.001
  • ABSTRACT: This paper introduces a new, model-based design method for interactive health information technology (IT) systems. This method extends workflow models with models of conceptual work products. When the health care work being modeled is substantially cognitive, tacit, and complex in nature, graphical workflow models can become too complex to be useful to designers. Conceptual models complement and simplify workflows by providing an explicit specification for the information product they must produce. We illustrate how conceptual work products can be modeled using standard software modeling language, which allows them to provide fundamental requirements for what the workflow must accomplish and the information that a new system should provide. Developers can use these specifications to envision how health IT could enable an effective cognitive strategy as a workflow with precise information requirements. We illustrate the new method with a study conducted in an outpatient multiple sclerosis (MS) clinic. This study shows specifically how the different phases of the method can be carried out, how the method allows for iteration across phases, and how the method generated a health IT design that is efficient and easy to use.
    Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.10.014
  • ABSTRACT: The causal and interplay mechanisms of Single Nucleotide Polymorphisms (SNPs) associated with complex diseases (complex disease SNPs) investigated in genome-wide association studies (GWAS) at the transcriptional level (mRNA) are poorly understood despite recent advancements such as discoveries reported in the Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTEx). Protein interaction network analyses have successfully improved our understanding of both single gene diseases (Mendelian diseases) and complex diseases. Whether the mRNAs downstream of complex disease genes are central or peripheral in the genetic information flow relating DNA to mRNA remains unclear and may be disease-specific. Using expression Quantitative Trait Loci (eQTL) that provide DNA to mRNA associations and network centrality metrics, we hypothesize that we can unveil the systems properties of information flow between SNPs and the transcriptomes of complex diseases. We compare different conditions such as naïve SNP assignments and stringent linkage disequilibrium (LD) free assignments for transcripts to remove confounders from LD. Additionally, we compare the results from eQTL networks between lymphoblastoid cell lines and liver tissue. Empirical permutation resampling (p < 0.001) and theoretical Mann-Whitney U test (p < 10^-30) statistics indicate that mRNAs corresponding to complex disease SNPs via eQTL associations are likely to be regulated by a larger number of SNPs than expected. We name this novel property mRNA hubness in eQTL networks, and further term mRNAs with high hubness master integrators. mRNA master integrators receive and coordinate the perturbation signals from large numbers of polymorphisms and respond to the personal genetic architecture integratively. This genetic signal integration contrasts with the mechanism underlying some Mendelian diseases, where a genetic polymorphism affecting a single protein hub produces a divergent signal that affects a large number of downstream proteins. Indeed, we verify that this property is independent of the hubness in protein networks for which these mRNAs are transcribed. Our findings provide novel insights into the pleiotropy of mRNAs targeted by complex disease polymorphisms and the architecture of the information flow between the genetic polymorphisms and transcriptomes of complex diseases.
    Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.10.010 (see sketch 4 after this list)

  • Journal of Biomedical Informatics 11/2015; DOI:10.1016/j.jbi.2015.10.007
  • ABSTRACT: Alternative splicing is an important component of tumorigenesis. The recent advent of exon array technology enables the detection of alternative splicing at a genome-wide scale. The analysis of high-throughput alternative splicing data is not yet standard, and methodological developments are still needed. We propose a novel statistical approach - Dually Constrained Correspondence Analysis (DCCA) - for the detection of splicing changes in exon array data. Using this methodology, we investigated the genome-wide alteration of alternative splicing in patients with non-small cell lung cancer treated with bevacizumab/erlotinib. Splicing candidates reveal a series of genes related to carcinogenesis (SFTPB), cell adhesion (STAB2, PCDH15, HABP2), tumor aggressiveness (ARNTL2), apoptosis, proliferation and differentiation (PDE4D, FLT3, IL1R2), cell invasion (ETV1), as well as tumor growth (OLFM4, FGF14), tumor necrosis (AFF3) or tumor suppression (TUSC3, CSMD1, RHOBTB2, SERPINB5), with indication of known alternative splicing in a majority of genes. DCCA facilitates the identification of putative biologically relevant alternative splicing events in high-throughput exon array data.
    Journal of Biomedical Informatics 10/2015; DOI:10.1016/j.jbi.2015.10.002 (see sketch 5 after this list)
  • ABSTRACT: The rapidly increasing volume of clinical information captured in Electronic Health Records (EHRs) has led to the application of increasingly sophisticated models for purposes such as disease subtype discovery and predictive modeling. However, increasing adoption of EHRs implies that in the near future, much of the data available for such purposes will be from a time period during which both the practice of medicine and the clinical use of EHRs are in flux due to historic changes in both technology and incentives. In this work, we explore the implications of this phenomenon, called non-stationarity, on predictive modeling. We focus on the problem of predicting delayed wound healing using data available in the EHR during the first week of care in outpatient wound care centers, using a large dataset covering over 150,000 individual wounds and 59,958 patients seen over a period of four years. We manipulate the degree of non-stationarity seen by the model development process by changing the way data is split into training and test sets. We demonstrate that non-stationarity can lead to quite different conclusions regarding the relative merits of different models with respect to predictive power and calibration of their posterior probabilities. Under the non-stationarity exhibited in this dataset, the performance advantage of complex methods such as stacking relative to the best simple classifier disappears. Ignoring non-stationarity can thus lead to sub-optimal model selection in this task.
    Journal of Biomedical Informatics 10/2015; DOI:10.1016/j.jbi.2015.10.006 (see sketch 6 after this list)
  • ABSTRACT: Scientific text annotation has become an important task for biomedical scientists. Nowadays, there is an increasing need for the development of intelligent systems to support new scientific findings. Public databases available on the Web provide useful data, but much more useful information is still accessible only in scientific texts. Text annotation relies on the use of ontologies to maintain annotations based on a uniform vocabulary. However, it is difficult to use an ontology, especially one that covers a large domain. In addition, since scientific texts explore multiple domains, which are covered by distinct ontologies, the task becomes even more difficult. Moreover, there are dozens of ontologies in the biomedical area, and they are usually large in terms of the number of concepts. It is in this context that ontology modularization can be useful. This work presents an approach to annotating scientific documents using modules of different ontologies, which are built according to a module extraction technique. The main idea is to analyze a set of single-ontology annotations on a text to find out the user's interests. Based on these annotations, a set of modules is extracted from a set of distinct ontologies and made available to the user for complementary annotation. The reduced size and focus of the extracted modules tend to facilitate the annotation task. An experiment was conducted to evaluate this approach, with the participation of a bioinformatics specialist from the Laboratory of Peptides and Proteins of the IOC/Fiocruz, who was interested in discovering new drug targets for combating tropical diseases.
    Journal of Biomedical Informatics 10/2015; DOI:10.1016/j.jbi.2015.09.022 (see sketch 7 after this list)
  • ABSTRACT: Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random: the measurement of a variable such as blood glucose may depend on its prior values as well as those of other variables. These dependencies exist across time as well, but current methods have yet to incorporate such temporal relationships along with multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time-lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and on the Fourier transform. This enables imputation of missing values even when all data at a time point are missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring), the proposed method has the highest imputation accuracy. This held for up to half of the data being missing and when consecutive missing values form a significant fraction of the overall time series length.
    Journal of Biomedical Informatics 10/2015; DOI:10.1016/j.jbi.2015.10.004 (see sketch 8 after this list)
  • ABSTRACT: We present the Unsupervised Phenome Model (UPhenome), a probabilistic graphical model for large-scale discovery of computational models of disease, or phenotypes. We tackle this challenge through the joint modeling of a large set of diseases and a large set of clinical observations. The observations are drawn directly from heterogeneous patient record data (notes, laboratory tests, medications, and diagnosis codes), and the diseases are modeled in an unsupervised fashion. We apply UPhenome to two qualitatively different mixtures of patients and diseases: records of extremely sick patients in the intensive care unit with constant monitoring, and records of outpatients regularly followed by care providers over multiple years. We demonstrate that the UPhenome model can learn from these different care settings, without any additional adaptation. Our experiments show that (i) the learned phenotypes combine the heterogeneous data types more coherently than baseline LDA-based phenotypes; (ii) they each represent single diseases rather than a mix of diseases more often than the baseline ones; and (iii) when applied to unseen patient records, they are correlated with the patients' ground-truth disorders. Code for training, inference, and quantitative evaluation is made available to the research community.
    Journal of Biomedical Informatics 10/2015; 58. DOI:10.1016/j.jbi.2015.10.001 (see sketch 9 after this list)
  • ABSTRACT: Purpose: To date, the standard nosology and prognostic schemes for myeloid neoplasms have been based on morphologic and cytogenetic criteria. We sought to test the hypothesis that a comprehensive, unbiased analysis of somatic mutations may allow for an improved classification of these diseases to predict outcome (overall survival). Experimental design: We performed whole-exome sequencing (WES) of 274 myeloid neoplasms, including myelodysplastic syndrome (MDS, N=75), myelodysplastic/myeloproliferative neoplasia (MDS/MPN, N=33), and acute myeloid leukemia (AML, N=22), augmenting the resulting mutational data with public WES results from AML (N=144). We fit random survival forests (RSFs) to the patient survival and clinical/cytogenetic data, with and without gene mutation information, to build prognostic classifiers. A targeted sequencing assay was used to sequence predictor genes in an independent cohort of 507 patients, whose accompanying data were used to evaluate performance of the risk classifiers. Results: We show that gene mutations modify the impact of standard clinical variables on patient outcome, and therefore their incorporation hones the accuracy of prediction. The mutation-based classification scheme robustly predicted patient outcome in the validation set (log-rank P = 6.77×10^-21; poor-prognosis vs. good-prognosis categories HR 10.4, 95% CI 3.21-33.6). The RSF-based approach also compares favorably with recently published efforts to incorporate mutational information for MDS prognosis. Conclusion: The results presented here support the inclusion of mutational information in the prognostic classification of myeloid malignancies. Our classification scheme is implemented in a publicly available web-based tool (http://myeloid-risk.case.edu/).
    Journal of Biomedical Informatics 10/2015; 58. DOI:10.1016/j.jbi.2015.10.003 (see sketch 10 after this list)
  • ABSTRACT: Predicting the Anatomical Therapeutic Chemical (ATC) codes of drugs is of vital importance for drug classification and repositioning, yet discovering new associations between drugs and ATC codes remains difficult. We propose a novel method named drug-domain hybrid (dD-Hybrid), which incorporates drug-domain interaction network information into prediction models to predict drugs' ATC codes. It is based on the assumption that drugs interacting with the same domain tend to share therapeutic effects. The results demonstrate that dD-Hybrid has performance comparable to other methods on the gold standard dataset. Further, several newly predicted drug-ATC pairs have been verified by experiments, which offers a novel way to utilize drugs for new purposes effectively.
    Journal of Biomedical Informatics 10/2015; 58. DOI:10.1016/j.jbi.2015.09.016 (see sketch 11 after this list)
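
Illustrative code sketches

The sketches below are editorial additions illustrating techniques named in the abstracts above. Each is a minimal, self-contained Python toy under stated assumptions, not the authors' code; all data, names, and parameters are fabricated unless an abstract states them.

Sketch 1. The MIS skill-assessment paper scores tool motion on features such as smoothness. One common smoothness proxy, assumed here and not necessarily the authors' metric, is a dimensionless squared-jerk score over a tracked tool-tip trajectory (the sampling rate and trajectory format are also assumptions):

```python
import numpy as np

def dimensionless_jerk(positions, fs):
    """positions: (T, 2) or (T, 3) tool-tip coordinates sampled at fs Hz.
    Lower scores indicate smoother motion."""
    dt = 1.0 / fs
    vel = np.gradient(positions, dt, axis=0)
    jerk = np.gradient(np.gradient(vel, dt, axis=0), dt, axis=0)
    duration = len(positions) * dt
    peak_speed = np.linalg.norm(vel, axis=1).max()
    # Integrate squared jerk magnitude, then rescale to a dimensionless score.
    integrated = np.sum(np.linalg.norm(jerk, axis=1) ** 2) * dt
    return integrated * duration ** 5 / peak_speed ** 2

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 200)
smooth = np.c_[t, t]                                # straight-line move
jittery = smooth + 0.01 * rng.standard_normal((200, 2))
print(dimensionless_jerk(smooth, 100.0), dimensionless_jerk(jittery, 100.0))
```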
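
Sketch 2. The Twitter ADR paper measures crowd agreement with Fleiss' kappa. The formula below is the standard one; the count matrix is a fabricated toy, and a real CrowdFlower export would first need reshaping into (items × categories) rater counts:

```python
import numpy as np

def fleiss_kappa(counts):
    """counts[i, j] = number of raters placing item i in category j.
    Rows must all sum to the same number of raters."""
    n_items = counts.shape[0]
    n_raters = counts.sum(axis=1)[0]
    p_j = counts.sum(axis=0) / (n_items * n_raters)      # category prevalence
    p_i = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar, p_e = p_i.mean(), np.sum(p_j ** 2)
    return (p_bar - p_e) / (1.0 - p_e)

# 4 tweets, 3 raters, categories {first-hand, not first-hand}.
counts = np.array([[3, 0], [2, 1], [1, 2], [3, 0]])
print(round(fleiss_kappa(counts), 3))
```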
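
Sketch 3. The gene-selection paper ranks genes by a two-sample T-score. This sketch shows only that ranking step, on synthetic data; the paper's backward sample-elimination loop with the modified logistic loss is omitted:

```python
import numpy as np

def t_scores(X, y):
    """X: (n_samples, n_genes) expression matrix; y: binary labels.
    Returns one absolute T-score per gene (higher = more discriminative)."""
    a, b = X[y == 0], X[y == 1]
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(mean_diff / (se + 1e-12))

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 1000))
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 2.0                           # plant 5 truly differential genes
print(np.argsort(t_scores(X, y))[::-1][:10])   # planted genes should lead
```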
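
Sketch 4. The eQTL paper's "mRNA hubness" is, at its core, the number of SNPs regulating each mRNA, tested against a permutation null. A toy edge list stands in for an eQTL network; the paper's LD pruning and tissue comparisons are not reproduced:

```python
import random
from collections import Counter

edges = [("rs1", "GENE_A"), ("rs2", "GENE_A"), ("rs3", "GENE_A"),
         ("rs4", "GENE_B"), ("rs5", "GENE_C"), ("rs6", "GENE_C")]
disease_mrnas = {"GENE_A"}                     # hypothetical GWAS-linked targets

indegree = Counter(mrna for _, mrna in edges)  # SNPs regulating each mRNA
universe = list(indegree)
obs = sum(indegree[m] for m in disease_mrnas) / len(disease_mrnas)

random.seed(0)
null = []
for _ in range(10000):                         # resample same-size gene sets
    draw = random.sample(universe, len(disease_mrnas))
    null.append(sum(indegree[m] for m in draw) / len(draw))
p = sum(x >= obs for x in null) / len(null)
print(f"disease mRNA mean in-degree {obs:.1f}, permutation p = {p:.4f}")
```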
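
Sketch 5. DCCA builds on correspondence analysis. Below is plain, unconstrained correspondence analysis of a count table via the SVD of standardized residuals; the dual constraints that define DCCA are not implemented here:

```python
import numpy as np

def correspondence_analysis(N, n_axes=2):
    """N: nonnegative count table (rows x columns)."""
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U[:, :n_axes] * sv[:n_axes]) / np.sqrt(r)[:, None]
    cols = (Vt[:n_axes].T * sv[:n_axes]) / np.sqrt(c)[:, None]
    return rows, cols                                    # principal coordinates

N = np.random.default_rng(1).integers(1, 50, size=(6, 4)).astype(float)
rows, cols = correspondence_analysis(N)
print(rows.shape, cols.shape)                            # (6, 2) (4, 2)
```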
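
Sketch 6. The non-stationarity study hinges on how data are split into training and test sets. The contrast is between a random split and a prospective split by date; the column names below are fabricated, not the wound-care dataset's:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "visit_date": pd.date_range("2011-01-01", periods=1000, freq="D"),
    "feature": rng.standard_normal(1000),
    "delayed_healing": rng.integers(0, 2, 1000),
})

# (a) Random split: train and test span the same period, hiding any drift
# in practice patterns or EHR usage.
shuffled = df.sample(frac=1.0, random_state=0)
rand_train, rand_test = shuffled.iloc[:800], shuffled.iloc[800:]

# (b) Prospective split: train on the past, test on the future, exposing
# model selection to the drift a deployed model would actually face.
cut = int(0.8 * len(df))                       # rows are already date-sorted
temp_train, temp_test = df.iloc[:cut], df.iloc[cut:]
print(len(rand_train), len(rand_test), len(temp_train), len(temp_test))
```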
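
Sketch 7. Ontology module extraction can be approximated as pulling a bounded neighborhood around "signature" concepts taken from a user's annotations. This networkx sketch on a toy is-a graph is one plausible reading, not the paper's specific extraction technique:

```python
import networkx as nx

onto = nx.DiGraph()                      # edges point child -> parent (is-a)
onto.add_edges_from([
    ("chloroquine", "antimalarial"), ("artemisinin", "antimalarial"),
    ("antimalarial", "drug"), ("drug", "chemical entity"),
    ("kinase", "protein"), ("protein", "chemical entity"),
])

def extract_module(g, signature, hops=2):
    """Keep every concept within `hops` edges of a signature concept."""
    und = g.to_undirected(as_view=True)
    keep = set(signature)
    for concept in signature:
        keep |= set(nx.single_source_shortest_path_length(und, concept, cutoff=hops))
    return g.subgraph(keep).copy()

module = extract_module(onto, {"chloroquine"})
print(sorted(module.nodes))              # a small, focused ontology slice
```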
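
Sketch 8. A much-simplified cousin of the FLk-NN idea: impute a missing value from the k nearest complete rows, with distances measured on time-lagged copies of the signals so that temporal structure contributes. The paper's Fourier-based component and its handling of distinct missingness types are omitted:

```python
import numpy as np

def lagged_knn_impute(X, k=3, lags=2):
    """X: (T, d) multivariate series with NaNs. Returns an imputed copy."""
    feats = np.hstack([np.roll(X, s, axis=0) for s in range(lags + 1)])
    out = X.copy()
    for t, j in zip(*np.where(np.isnan(X))):
        ok = ~np.isnan(feats).any(axis=1) & ~np.isnan(X[:, j])
        ok[t] = False                     # never match the query row itself
        if not ok.any():
            continue
        cand = np.where(ok)[0]
        query = np.nan_to_num(feats[t])   # toy simplification: zero-fill NaNs
        dist = np.linalg.norm(feats[cand] - query, axis=1)
        out[t, j] = X[cand[np.argsort(dist)[:k]], j].mean()
    return out

rng = np.random.default_rng(0)
X = np.sin(np.linspace(0, 10, 100))[:, None] + 0.1 * rng.standard_normal((100, 1))
X[[10, 40, 41], 0] = np.nan
print(np.isnan(lagged_knn_impute(X)).sum())   # 0: all gaps imputed
```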
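
Sketch 9. UPhenome is benchmarked against LDA-based phenotypes. This is the kind of LDA baseline referred to, run with scikit-learn on fabricated "patient documents" of concatenated codes and note tokens:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

patients = [                              # one "document" per patient record
    "metformin hba1c_high icd_250 polyuria",
    "insulin hba1c_high icd_250 neuropathy",
    "albuterol wheeze icd_493 spirometry_low",
    "inhaled_steroid wheeze icd_493 cough",
]
vec = CountVectorizer()
counts = vec.fit_transform(patients)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(f"phenotype {k}:", [terms[i] for i in topic.argsort()[-3:][::-1]])
```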
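
Sketch 10. The myeloid prognosis paper fits random survival forests. Below is a toy fit using the scikit-survival package (an assumption; the authors' own pipeline lives behind their web tool) on synthetic binary "mutation" features and exponential survival times:

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest   # assumes scikit-survival

rng = np.random.default_rng(0)
n = 200
X = rng.integers(0, 2, size=(n, 10)).astype(float)  # 10 binary "mutations"
# Shorter survival when "gene 0" is mutated; ~30% of follow-ups censored.
time = rng.exponential(np.where(X[:, 0] == 1, 12.0, 36.0))
event = rng.random(n) > 0.3
y = np.array(list(zip(event, time)), dtype=[("event", "?"), ("time", "<f8")])

rsf = RandomSurvivalForest(n_estimators=100, random_state=0).fit(X, y)
print(np.round(rsf.predict(X[:5]), 2))   # higher risk score = worse prognosis
```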
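
Sketch 11. dD-Hybrid's core assumption, that drugs interacting with the same domain tend to share therapeutic effects, reduces at toy scale to neighbor voting over a drug-domain map. All names and interactions below are fabricated:

```python
from collections import Counter

drug_domains = {"drugA": {"PF00001"},
                "drugB": {"PF00001", "PF00069"},
                "drugC": {"PF00069"}}
known_atc = {"drugA": {"C07"}, "drugC": {"L01"}}     # drugB's codes unknown

def predict_atc(query):
    votes = Counter()
    for other, codes in known_atc.items():
        if other != query and drug_domains[other] & drug_domains[query]:
            votes.update(codes)            # one vote per shared-domain neighbor
    return votes

print(predict_atc("drugB"))                # C07 and L01 each get one vote
```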