Journal of Biomedical Informatics (J Biomed Inform)

Publisher: Elsevier

Journal description

The Journal of Biomedical Informatics (formerly Computers and Biomedical Research) has been redesigned to reflect a commitment to high-quality original research papers and reviews in the area of biomedical informatics. Although published articles are motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, imaging, and bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices, and formal evaluations of completed systems, including clinical trials of information technologies, would generally be more suitable for publication in other venues. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report.

Current impact factor: 2.48

Impact Factor Rankings

Year         Impact factor
2015         Available summer 2015
2013/2014    2.482
2012         2.131
2011         1.792
2010         1.719
2009         2.432
2008         1.924
2007         2
2006         2.346
2005         2.388
2004         1.013
2003         0.855
2002         0.862

[Figure: impact factor over time]

Additional details

5-year impact: 2.43
Cited half-life: 4.40
Immediacy index: 0.55
Eigenfactor: 0.01
Article influence: 0.84
Website: Journal of Biomedical Informatics website
Other titles: Journal of biomedical informatics (Online)
ISSN: 1532-0480
OCLC: 45147742
Material type: Document, Periodical, Internet resource
Document type: Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details


  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Pre-print allowed on any website or open access repository
    • Voluntary deposit by author of author's post-print allowed on author's personal website, or institution's open scholarly website including Institutional Repository, without embargo, where there is not a policy or mandate
    • Deposit due to Funding Body, Institutional and Governmental policy or mandate only allowed where separate agreement between repository and the publisher exists.
    • Permitted deposit due to Funding Body, Institutional and Governmental policy or mandate may be required to comply with embargo periods of 12 months to 48 months.
    • Set statement to accompany deposit
    • Published source must be acknowledged
    • Must link to journal home page or articles' DOI
    • Publisher's version/PDF cannot be used
    • Articles in some journals can be made Open Access on payment of additional charge
    • NIH authors' articles will be submitted to PubMed Central after 12 months
    • Publisher last contacted on 18/10/2013
  • Classification
    • green

Publications in this journal

  • ABSTRACT: This study examines the ability of nonclinical adverse event observations to predict human clinical adverse events observed in drug development programs. In addition, it examines the relationship between nonclinical and clinical adverse event observations and drug withdrawal, and proposes a model to predict drug withdrawal based on these observations. These analyses provide risk assessments useful for planning patient safety programs, as well as a statistical framework for assessing the future success of drug programs based on nonclinical and clinical observations. Bayesian analyses were undertaken to investigate the connection between nonclinical adverse event observations and observations of that same event in clinical trials for a large set of approved drugs. We employed the same statistical methods used to evaluate the efficacy of diagnostic tests to evaluate the ability of nonclinical studies to predict adverse events in clinical studies, and adverse events in both to predict drug withdrawal. We find that some nonclinical observations suggest higher risk for observing the same adverse event in clinical studies, particularly arrhythmias, QT prolongation, and abnormal hepatic function. However, the lack of these events in nonclinical studies is found not to be a good predictor of safety in humans. Some nonclinical and clinical observations appear to be associated with high risk of drug withdrawal from market, especially arrhythmia and hepatic necrosis. We use the method to estimate the overall risk of drug withdrawal from market using the product of the risks from each nonclinical and clinical observation to create a risk profile.
    Journal of Biomedical Informatics 06/2015; 54. DOI:10.1016/j.jbi.2015.02.008
  • Jenna Marie Reps, Jonathan M Garibaldi, Uwe Aickelin, Jack E Gibson, Richard B Hubbard
    ABSTRACT: Big longitudinal observational medical data potentially hold a wealth of information and have been recognised as potential sources for gaining new drug safety knowledge. Unfortunately, there are many complexities and underlying issues when analysing longitudinal observational data. Due to these complexities, existing methods for large-scale detection of negative side effects using observational data all tend to have issues distinguishing between association and causality. New methods that can better discriminate causal and non-causal relationships need to be developed to fully utilise the data. In this paper we propose using a set of causality considerations developed by the epidemiologist Bradford Hill as a basis for engineering features that enable the application of supervised learning for the problem of detecting negative side effects. The Bradford Hill considerations look at various perspectives of a drug and outcome relationship to determine whether it shows causal traits. We taught a classifier to find patterns within these perspectives and it learned to discriminate between association and causality. The novelty of this research is the combination of supervised learning and Bradford Hill's causality considerations to automate the Bradford Hill causality assessment. We evaluated the framework on a drug safety gold standard known as the observational medical outcomes partnership's non-specified association reference set. The methodology obtained excellent discriminative ability, with area under the curve values ranging between 0.792 and 0.940 (existing method optimal: 0.73) and a mean average precision of 0.640 (existing method optimal: 0.141). The proposed features can be calculated efficiently and be readily updated, making the framework suitable for big observational data. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.06.011
  • ABSTRACT: Microarray platforms enable the investigation of allelic variants that may be correlated to phenotypes. Among those, the Affymetrix DMET (Drug Metabolism Enzymes and Transporters) platform enables the simultaneous investigation of all the genes that are related to drug absorption, distribution, metabolism and excretion (ADME). Although recent studies demonstrated the effectiveness of the use of DMET data for studying drug response or toxicity in clinical studies, there is a lack of tools for the automatic analysis of DMET data. In a previous work we developed DMET-Analyzer, a methodology and a supporting platform able to automate the statistical study of allelic variants, which has been validated in several clinical studies. Although DMET-Analyzer is able to correlate a single variant for each probe (related to a portion of a gene) through the use of the Fisher test, it is unable to discover multiple associations among allelic variants, due to its underlying statistical analysis strategy that focuses on a single variant at a time. To overcome those limitations, here we propose a new analysis methodology for DMET data based on Association Rules mining, and an efficient implementation of this methodology, named DMET-Miner. DMET-Miner extends the DMET-Analyzer tool with data mining capabilities and correlates the presence of a set of allelic variants with the conditions of patients' samples by exploiting association rules. To face the high number of frequent itemsets generated when considering large clinical studies based on DMET data, DMET-Miner uses an efficient data structure and implements an optimized search strategy that reduces the search space and the execution time. Preliminary experiments on synthetic DMET datasets show how DMET-Miner outperforms off-the-shelf data mining suites such as the FP-Growth algorithms available in Weka and RapidMiner. To demonstrate the biological relevance of the extracted association rules and the effectiveness of the proposed approach from a medical point of view, some preliminary studies on a real clinical dataset are currently under medical investigation. Copyright © 2015 Elsevier Inc. All rights reserved.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.06.005
  • ABSTRACT: Self-monitoring is an integral component of many chronic diseases; however, few theoretical frameworks address how individuals understand self-monitoring data and use it to guide self-management. This work aims to articulate a theoretical framework of sensemaking in diabetes self-management that integrates existing scholarship with empirical data. The proposed framework is grounded in theories of sensemaking adopted from organizational behavior, education, and human-computer interaction. To empirically validate the framework, the researchers reviewed and analyzed reports on qualitative studies of diabetes self-management practices published in peer-reviewed journals from 2000 to 2015. The proposed framework distinguishes between sensemaking and habitual modes of self-management and identifies three essential sensemaking activities: perception of new information related to health and wellness, development of inferences that inform selection of actions, and carrying out daily activities in response to new information. The analysis of qualitative findings from 50 published reports provided ample empirical evidence for the proposed framework; however, it also identified a number of barriers to engaging in sensemaking in diabetes self-management. The proposed framework suggests new directions for research in diabetes self-management and for design of new informatics interventions for data-driven self-management. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; 288. DOI:10.1016/j.jbi.2015.06.006
  • ABSTRACT: Assessment of medical trainee learning through pre-defined competencies is now commonplace in schools of medicine. We describe a novel electronic advisor system using natural language processing (NLP) to identify two geriatric medicine competencies from medical student clinical notes in the electronic medical record: advance directives (AD) and altered mental status (AMS). Clinical notes from third year medical students were processed using a general-purpose NLP system to identify biomedical concepts and their section context. The system analyzed these notes for relevance to AD or AMS and generated custom email alerts to students with embedded supplemental learning material customized to their notes. Recall and precision of the two advisors were evaluated by physician review. Students were given pre and post multiple choice question tests broadly covering geriatrics. Of 102 students approached, 66 students consented and enrolled. The system sent 393 email alerts to 54 students (82%), including 270 for AD and 123 for AMS. Precision was 100% for AD and 93% for AMS. Recall was 69% for AD and 100% for AMS. Students mentioned ADs for 43 patients, with all mentions occurring after first having received an AD reminder. Students accessed educational links 34 times from the 393 email alerts. There was no difference in pre (mean 62%) and post (mean 60%) test scores. The system effectively identified two educational opportunities using NLP applied to clinical notes and demonstrated a small change in student behavior. Use of electronic advisors such as these may provide a scalable model to assess specific competency elements and deliver educational opportunities. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.06.004
  • ABSTRACT: Investments of resources to purposively improve the movement of information between health system providers are currently made with imperfect information. No inventories of system-level electronic health information flows currently exist, nor do measures of inter-organizational electronic information exchange. Using Protégé 4, an open-source OWL (Web Ontology Language) editor and knowledge-based framework, we formalized a model that decomposes inter-organizational electronic health information flow into derivative concepts such as diversity, breadth, volume, structure, standardization and connectivity. The ontology was populated with data from a regional health system and the flows were measured. Individual instances' properties were inferred from their class associations as determined by their data and object property rules. It was also possible to visualize interoperability activity for regional analysis and planning purposes. A property called Impact was created from the total number of patients or clients that a health entity in the region served in a year, and the total number of health service providers or organizations with whom it exchanged information in support of clinical decision-making, diagnosis or treatment. Identifying providers with a high Impact but low Interoperability score could assist planners and policy-makers to optimize technology investments intended to electronically share patient information across the continuum of care. Finally, we demonstrated how linked ontologies were used to identify logical inconsistencies in self-reported data for the study. Copyright © 2015 Elsevier Inc. All rights reserved.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.05.020
  • ABSTRACT: The importance of searching biomedical literature for drug interactions and side effects is apparent. Current digital libraries (e.g., PubMed) suffer from infrequent tagging and metadata annotation updates. Such limitations mean that literature is not linked to new scientific evidence, which poses considerable challenges for scientists searching biomedical repositories. In this paper, we present a network mining approach that provides a bridge for linking and searching drug-related literature. Our contributions here are twofold: (1) an efficient algorithm called HashPairMiner to address the run-time complexity issues demonstrated in its predecessor algorithm, HashnetMiner, and (2) a database of discoveries hosted on the web to facilitate literature search using the results produced by HashPairMiner. Though the K-H Network model and the HashPairMiner algorithm are fairly young, their outcome is evidence of the considerable promise they offer to the biomedical science community in general and the drug research community in particular. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; 56(SI Pharmacovigilance):157–168. DOI:10.1016/j.jbi.2015.05.015
  • ABSTRACT: As healthcare shifts from the hospital to the home, it is becoming increasingly important to understand how patients interact with home medical devices, to inform the safe and patient-friendly design of these devices. Distributed Cognition (DCog) has been a useful theoretical framework for understanding situated interactions in the healthcare domain. However, it has not previously been applied to study interactions with home medical devices. In this study, DCog was applied to understand renal patients' interactions with Home Hemodialysis Technology (HHT), as an example of a home medical device. Data was gathered through ethnographic observations and interviews with 19 renal patients and interviews with seven professionals. Data was analyzed through the principles summarized in the Distributed Cognition for Teamwork methodology. In this paper we focus on the analysis of system activities, information flows, social structures, physical layouts, and artefacts. By explicitly considering different ways in which cognitive processes are distributed, the DCog approach helped to understand patients' interaction strategies, and pointed to design opportunities that could improve patients' experiences of using HHT. The findings highlight the need to design HHT taking into consideration likely scenarios of use in the home and of the broader home context. A setting such as home hemodialysis has the characteristics of a complex and safety-critical socio-technical system, and a DCog approach effectively helps to understand how safety is achieved or compromised in such a system. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; 13. DOI:10.1016/j.jbi.2015.06.002
  • ABSTRACT: Many in-hospital mortality risk prediction scores dichotomize predictive variables to simplify the score calculation. However, hard thresholding in these additive stepwise scores of the form "add x points if variable v is above/below threshold t" may lead to critical failures. In this paper, we seek to develop risk prediction scores that preserve clinical knowledge embedded in features and structure of the existing additive stepwise scores while addressing limitations caused by variable dichotomization. To this end, we propose a novel score structure that relies on a transformation of predictive variables by means of nonlinear logistic functions facilitating smooth differentiation between critical and normal values of the variables. We develop an optimization framework for inferring parameters of the logistic functions for a given patient population via cyclic block coordinate descent. The parameters may readily be updated as the patient population and standards of care evolve. We tested the proposed methodology on two populations: (1) brain trauma patients admitted to the intensive care unit of the Dell Children's Medical Center of Central Texas between 2007 and 2012, and (2) adult ICU patient data from the MIMIC II database. The results are compared with those obtained by the widely used PRISM III and SOFA scores. The prediction power of a score is evaluated using area under the ROC curve, Youden's index, and precision-recall balance in a cross-validation study. The results demonstrate that the new framework enables significant performance improvements over PRISM III and SOFA in terms of all three criteria. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.05.021
  • ABSTRACT: Recently, the rapid advance in genome sequencing technology has led to the production of huge amounts of sensitive genomic data. However, the increasing number of genetic tests raises a serious privacy challenge, as genomic data is the ultimate source of identity for humans. Privacy threats and possible solutions regarding undesired access to genomic data have lately been discussed; however, it is challenging to apply the proposed solutions to real-life problems due to the complex nature of security definitions. In this review, we have categorized pre-existing problems and corresponding solutions in a more understandable and convenient way. Additionally, we have included the open privacy problems that come with each genomic data processing procedure. We believe our classification of genome-associated privacy problems will pave the way for linking real-life problems with previously proposed methods. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.05.022
  • ABSTRACT: Evaluation of survival models to predict cancer patient prognosis is one of the most important areas of emphasis in cancer research. A binary classification approach has difficulty directly predicting survival due to the characteristics of censored observations and the fact that the predictive power depends on the threshold used to set two classes. In contrast, the traditional Cox regression approach has some drawbacks in the sense that it does not allow for the identification of interactions between genomic features, which could have key roles associated with cancer prognosis. In addition, data integration is regarded as one of the important issues in improving the predictive power of survival models since cancer could be caused by multiple alterations through meta-dimensional genomic data including genome, epigenome, transcriptome, and proteome. Here we have proposed a new integrative framework designed to perform these three functions simultaneously: (1) predicting censored survival data; (2) integrating meta-dimensional omics data; (3) identifying interactions within/between meta-dimensional genomic features associated with survival. In order to predict censored survival time, martingale residuals were calculated as a new continuous outcome and a new fitness function used by the grammatical evolution neural network (GENN) based on mean absolute difference of martingale residuals was implemented. To test the utility of the proposed framework, a simulation study was conducted, followed by an analysis of meta-dimensional omics data including copy number, gene expression, DNA methylation, and protein expression data in breast cancer retrieved from The Cancer Genome Atlas (TCGA). On the basis of the results from the breast cancer dataset, we were able to identify interactions not only within a single dimension of genomic data but also between meta-dimensional omics data that are associated with survival. Notably, the predictive power of our best meta-dimensional model was 73%, which outperformed all of the other models built on a single dimension of genomic data. Breast cancer is an extremely heterogeneous disease and the high levels of genomic diversity within/between breast tumors could affect the risk of therapeutic responses and disease progression. Thus, identifying interactions within/between meta-dimensional omics data associated with survival in breast cancer is expected to deliver direction for improved meta-dimensional prognostic biomarkers and therapeutic targets. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.05.019
  • ABSTRACT: Risk sharing arrangements between hospitals and payers together with penalties imposed by the Centers for Medicare and Medicaid Services (CMS) are driving an interest in decreasing early readmissions. There are a number of published risk models predicting 30-day readmissions for particular patient populations; however, they often exhibit poor predictive performance and would be unsuitable for use in a clinical setting. In this work we describe and compare several predictive models, some of which have never been applied to this task and which outperform the regression methods that are typically applied in the healthcare literature. In addition, we apply methods from deep learning to the five conditions CMS is using to penalize hospitals, and offer a simple framework for determining which conditions are most cost effective to target. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.05.016
  • ABSTRACT: Patient recruitment is one of the most important barriers to successful completion of clinical trials and thus to obtaining evidence about new methods for prevention, diagnostics and treatment. The reason is that recruitment is labor-intensive. It requires identifying candidate patients for the trial (the population under study), and verifying for each patient whether the eligibility criteria are met. The work we describe in this paper aims to support the comparison of populations under study in different trials, and the design of eligibility criteria for new trials. We do this by introducing structured eligibility criteria that enhance reuse of criteria across trials. We developed a method that allows for automated structuring of criteria from text. Additionally, structured eligibility criteria allow us to propose suggestions for relaxation of criteria to remove potentially unnecessarily restrictive conditions. We thereby increase the recruitment potential and generalizability of a trial.
    Journal of Biomedical Informatics 05/2015; DOI:10.1016/j.jbi.2015.05.005
  • ABSTRACT: Predicting malignancy of solitary pulmonary nodules from computed tomography scans is a difficult and important problem in the diagnosis of lung cancer. This paper investigates the contribution of nodule characteristics in the prediction of malignancy. Using data from the Lung Image Database Consortium (LIDC) database, we propose a weighted rule based classification approach for predicting malignancy of pulmonary nodules. The LIDC database contains CT scans of nodules and information about nodule characteristics evaluated by multiple annotators. In the first step of our method, votes for nodule characteristics are obtained from ensemble classifiers by using image features. In the second step, votes and rules obtained from radiologist evaluations are used by a weighted rule based method to predict malignancy. The rule based method is constructed by using radiologist evaluations on previous cases. Correlations between malignancy and other nodule characteristics and the agreement ratio of radiologists are considered in rule evaluation. To handle the unbalanced nature of LIDC, ensemble classifiers and data balancing methods are used. The proposed approach is compared with the classification methods trained on image features. Classification accuracy, specificity and sensitivity of classifiers are measured. The experimental results show that using nodule characteristics for malignancy prediction can improve classification results. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 05/2015; 56. DOI:10.1016/j.jbi.2015.05.011
  • ABSTRACT: The 2014 i2b2/UTHealth natural language processing shared task featured a track focused on identifying risk factors for heart disease (specifically, Cardiac Artery Disease) in clinical narratives. For this track, we used a "light" annotation paradigm to annotate a set of 1,304 longitudinal medical records describing 296 patients for risk factors and the times they were present. We designed the annotation task for this track with the goal of balancing annotation load and time with quality, so as to generate a gold standard corpus that can benefit a clinically-relevant task. We applied light annotation procedures and determined the gold standard using majority voting. On average, the agreement of annotators with the gold standard was above 0.95, indicating high reliability. The resulting document-level annotations generated for each record in each longitudinal EMR in this corpus provide information that can support studies of progression of heart disease risk factors in the included patients over time. These annotations were used in the Risk Factor track of the 2014 i2b2/UTHealth shared task. Participating systems achieved a mean micro-averaged F1 measure of 0.815 and a maximum F1 measure of 0.928 for identifying these risk factors in patient records. Copyright © 2015. Published by Elsevier Inc.
    Journal of Biomedical Informatics 05/2015; DOI:10.1016/j.jbi.2015.05.009
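
The last abstract above mentions two generic techniques: a gold standard determined by majority voting, and systems scored with a micro-averaged F1 measure. As a rough illustration only, both can be sketched in a few lines of Python. The function names, the tie-handling rule, and the per-label count format below are assumptions made for this sketch; they are not taken from the shared task's annotation guidelines or its official scoring tool.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label given by the most annotators.

    Returns None for an empty input or a tie (tie handling is an
    assumption of this sketch, not a rule from the shared task).
    """
    ranked = Counter(labels).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def micro_f1(counts):
    """Micro-averaged F1 from per-label (tp, fp, fn) counts.

    Micro-averaging pools true positives, false positives and false
    negatives over all labels before computing precision and recall.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical annotations and counts, for illustration only.
    print(majority_vote(["present", "present", "absent"]))
    print(micro_f1({"hypertension": (8, 2, 2), "smoker": (2, 0, 2)}))
```

Because the counts are pooled, frequent labels dominate a micro-averaged score; a macro average would instead compute F1 per label and take the mean.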