Journal of Biomedical Informatics (J Biomed Inform)

Publisher: Elsevier

Journal description

The Journal of Biomedical Informatics (formerly Computers and Biomedical Research) has been redesigned to reflect a commitment to high-quality original research papers and reviews in the area of biomedical informatics. Although published articles are motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, imaging, and bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices and formal evaluations of completed systems, including clinical trials of information technologies, would generally be more suitable for publication in other venues. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report.

Current impact factor: 2.13

Impact Factor Rankings

2015 Impact Factor Available summer 2015
2011 Impact Factor 1.792

Additional details

5-year impact 2.43
Cited half-life 4.40
Immediacy index 0.55
Eigenfactor 0.01
Article influence 0.84
Website Journal of Biomedical Informatics website
Other titles Journal of biomedical informatics (Online)
ISSN 1532-0480
OCLC 45147742
Material type Document, Periodical, Internet resource
Document type Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details


  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Pre-print allowed on any website or open access repository
    • Voluntary deposit by the author of the author's post-print is allowed on the author's personal website or the institution's open scholarly website, including an Institutional Repository, without embargo, where there is no policy or mandate
    • Deposit due to a Funding Body, Institutional, or Governmental policy or mandate is only allowed where a separate agreement between the repository and the publisher exists
    • Permitted deposits due to a Funding Body, Institutional, or Governmental policy or mandate may be required to comply with embargo periods of 12 to 48 months
    • Set statement to accompany deposit
    • Published source must be acknowledged
    • Must link to journal home page or article's DOI
    • Publisher's version/PDF cannot be used
    • Articles in some journals can be made Open Access on payment of additional charge
    • NIH authors' articles will be submitted to PubMed Central after 12 months
    • Publisher last contacted on 18/10/2013
  • Classification
    green

Publications in this journal

  •
    ABSTRACT: This study examines the ability of nonclinical adverse event observations to predict human clinical adverse events observed in drug development programs. In addition, it examines the relationship of nonclinical and clinical adverse event observations to drug withdrawal and proposes a model to predict drug withdrawal based on these observations. These analyses provide risk assessments useful both for planning patient safety programs and as a statistical framework for assessing the future success of drug programs based on nonclinical and clinical observations. Bayesian analyses were undertaken to investigate the connection between nonclinical adverse event observations and observations of the same event in clinical trials for a large set of approved drugs. We employed the same statistical methods used to evaluate the efficacy of diagnostic tests to evaluate the ability of nonclinical studies to predict adverse events in clinical studies, and of adverse events in both to predict drug withdrawal. We find that some nonclinical observations suggest a higher risk of observing the same adverse event in clinical studies, particularly arrhythmias, QT prolongation, and abnormal hepatic function. However, the absence of these events in nonclinical studies is not a good predictor of safety in humans. Some nonclinical and clinical observations appear to be associated with a high risk of drug withdrawal from market, especially arrhythmia and hepatic necrosis. We use the method to estimate the overall risk of drug withdrawal from market, using the product of the risks from each nonclinical and clinical observation to create a risk profile.
    Journal of Biomedical Informatics 06/2015; DOI:10.1016/j.jbi.2015.02.008
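The abstract's core device, treating a nonclinical finding as a diagnostic "test" for the matching clinical event and then multiplying per-observation risks into a profile, can be sketched as follows. All counts, function names, and the odds-form combination are illustrative, not the paper's fitted values:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    # Treat a nonclinical finding as a "diagnostic test" for the matching
    # clinical adverse event; the four counts form the usual 2x2 table.
    sensitivity = tp / (tp + fn)              # P(nonclinical+ | clinical+)
    specificity = tn / (tn + fp)              # P(nonclinical- | clinical-)
    ppv = tp / (tp + fp)                      # P(clinical+ | nonclinical+)
    lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
    return sensitivity, specificity, ppv, lr_pos

def combined_risk(baseline, likelihood_ratios):
    # Combine per-observation likelihood ratios multiplicatively via the
    # odds form of Bayes' rule, mirroring the "product of the risks" idea.
    odds = baseline / (1 - baseline)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)
```

A likelihood ratio above 1 (e.g. a nonclinical arrhythmia signal) raises the estimated withdrawal risk above its baseline; the absence of a signal with low sensitivity barely lowers it, matching the abstract's caveat.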
  •
    ABSTRACT: Identifying unknown drug interactions is of great benefit in the early detection of adverse drug reactions. Despite the existence of several resources for drug-drug interaction (DDI) information, the wealth of such information is buried in a body of unstructured medical text which is growing exponentially. This calls for developing text mining techniques for identifying DDIs. The state-of-the-art DDI extraction methods use Support Vector Machines (SVMs) with non-linear composite kernels to explore diverse contexts in literature. While computationally less expensive, linear kernel-based systems have not achieved a comparable performance in DDI extraction tasks. In this work, we propose an efficient and scalable system using a linear kernel to identify DDI information. The proposed approach consists of two steps: identifying DDIs and assigning one of four different DDI types to the predicted drug pairs. We demonstrate that when equipped with a rich set of lexical and syntactic features, a linear SVM classifier is able to achieve a competitive performance in detecting DDIs. In addition, the one-against-one strategy proves vital for addressing an imbalance issue in DDI type classification. Applied to the DDIExtraction 2013 corpus, our system achieves an F1 score of 0.670, as compared to 0.651 and 0.609 reported by the top two participating teams in the DDIExtraction 2013 challenge, both based on non-linear kernel methods.
    Journal of Biomedical Informatics 03/2015; 39. DOI:10.1016/j.jbi.2015.03.002
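A rough sketch of the two ingredients the abstract describes: sparse lexical features for a candidate drug pair, and a linear decision function over them. A perceptron-style learner stands in here for the paper's linear-kernel SVM (the training objective differs, but both yield a linear scorer over the same feature vector); the feature templates and sentences are invented:

```python
def features(sentence, d1, d2):
    # Toy lexical features for a candidate drug pair: the bag of tokens
    # between the two mentions plus a distance bucket -- a stand-in for
    # the paper's much richer lexical and syntactic feature set.
    toks = sentence.lower().split()
    i, j = toks.index(d1), toks.index(d2)
    feats = {f"between={t}" for t in toks[i + 1:j]}
    feats.add(f"dist={'short' if j - i <= 4 else 'long'}")
    return feats

def train_perceptron(data, epochs=10):
    # Perceptron stand-in for a linear-kernel SVM: learns a weight per
    # feature; prediction is the sign of the summed weights.
    w = {}
    for _ in range(epochs):
        for feats, label in data:      # label is +1 (DDI) or -1 (no DDI)
            score = sum(w.get(f, 0.0) for f in feats)
            pred = 1 if score > 0 else -1
            if pred != label:
                for f in feats:
                    w[f] = w.get(f, 0.0) + label
    return w
```

For the four-way DDI type step, the abstract's one-against-one strategy would train one such binary scorer per pair of types and let the pairwise winners vote.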
  •
    ABSTRACT: In Electronic Health Records (EHRs), much of the valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, due to its failure to consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of a patient's condition or contaminate study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence, using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU), and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs.
    Journal of Biomedical Informatics 03/2015; DOI:10.1016/j.jbi.2015.02.010
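The contrast between NegEx's window heuristic and DEEPEN's dependency-based check can be illustrated on a toy sentence. The trigger list, window size, and hand-written dependency edges are stand-ins for the real NegEx lexicon and Stanford-parser output:

```python
NEG_TRIGGERS = {"no", "not", "denies", "without"}

def window_negated(sentence, concept, window=5):
    # NegEx-style heuristic: any trigger in a fixed window before the
    # concept marks it negated, ignoring sentence structure entirely.
    toks = sentence.lower().split()
    i = toks.index(concept)
    return any(t in NEG_TRIGGERS for t in toks[max(0, i - window):i])

def dependency_negated(edges, concept):
    # DEEPEN-style refinement (heavily simplified): the concept counts as
    # negated only if a negation trigger actually governs it in the
    # dependency parse. `edges` are hand-written (head, dependent) pairs
    # standing in for parser output.
    return any(h in NEG_TRIGGERS and d == concept for h, d in edges)
```

In "patient denies chest pain but reports fever", the window scan wrongly negates "fever" (the kind of false positive DEEPEN targets), while the dependency check negates only "pain", which "denies" governs.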
  •
    ABSTRACT: Complex clinical decisions require the decision maker to evaluate multiple factors that may interact with each other. Many clinical studies, however, report 'univariate' relations between a single factor and outcome. Such univariate statistics are often insufficient to provide useful support for complex clinical decisions even when they are pooled using meta-analysis. More useful decision support could be provided by evidence-based models that take the interaction between factors into account. In this paper, we propose a method of integrating the univariate results of a meta-analysis with a clinical dataset and expert knowledge to construct multivariate Bayesian network (BN) models. The technique reduces the size of the dataset needed to learn the parameters of a model of a given complexity. Supplementing the data with the meta-analysis results avoids the need to either simplify the model - ignoring some complexities of the problem - or to gather more data. The method is illustrated by a clinical case study into the prediction of the viability of severely injured lower extremities. The case study illustrates the advantages of integrating combined evidence into BN development: the BN developed using our method outperformed four different data-driven structure learning methods, and a well-known scoring model (MESS) in this domain.
    Journal of Biomedical Informatics 08/2014; DOI:10.1016/j.jbi.2014.07.018
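One simple way to picture "supplementing the data with meta-analysis results" is the standard Bayesian pseudo-count device: the published univariate estimate enters a conditional probability table (CPT) entry as a Dirichlet prior of chosen strength. This illustrates the idea of reducing the data needed to learn parameters; it is not the authors' exact learning procedure:

```python
def cpt_entry(pos, neg, meta_prob, prior_strength):
    # Blend a meta-analysis estimate (encoded as pseudo-counts of total
    # weight `prior_strength`) with observed positive/negative counts.
    # With no data the estimate equals the published value; as counts
    # grow, the data dominate.
    a = meta_prob * prior_strength + pos
    b = (1 - meta_prob) * prior_strength + neg
    return a / (a + b)
```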
  •
    ABSTRACT: Insights about patterns of system use are often gained through the analysis of system log files, which record the actual behavior of users. In a clinical context, however, few attempts have been made to typify system use through log file analysis. The present study offers a framework for identifying, describing, and discerning among patterns of use of a clinical information retrieval system. We use the session attributes of volume, diversity, granularity, duration, and content to define a multidimensional space in which each specific session can be positioned. We also describe an analytical method for identifying the common archetypes of system use in this multidimensional space. We demonstrate the value of the proposed framework with a log file of the use of a health information exchange (HIE) system by physicians in an emergency department (ED) of a large Israeli hospital. The analysis reveals five distinct patterns of system use, which have yet to be described in the relevant literature. The results of this study have the potential to inform the design of HIE systems for efficient and effective use, thus increasing their contribution to the clinical decision-making process.
    Journal of Biomedical Informatics 07/2014; DOI:10.1016/j.jbi.2014.07.003
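The proposed five-attribute session space invites a clustering view of "archetypes". A minimal k-means over toy session vectors (volume, diversity, granularity, duration, content) sketches this; the paper's archetype-identification method may differ, and the initial centroids here are fixed for reproducibility:

```python
def kmeans(points, centroids, iters=10):
    # Minimal k-means: each session is a point in the 5-attribute space;
    # the surviving centroids play the role of usage archetypes.
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            best = min(range(len(centroids)),
                       key=lambda c: sum((a - b) ** 2
                                         for a, b in zip(p, centroids[c])))
            clusters[best].append(p)
        centroids = [tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else c
                     for cl, c in zip(clusters, centroids)]
    return centroids, clusters
```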
  •
    ABSTRACT: This work proposes a histology image indexing strategy based on multimodal representations obtained from the combination of visual features and associated semantic annotations. Both data modalities are complementary information sources for an image retrieval system, since visual features lack explicit semantic information and semantic terms do not usually describe the visual appearance of images. The paper proposes a novel strategy to build a fused image representation using matrix factorization algorithms and data reconstruction principles to generate a set of multimodal features. The methodology can seamlessly recover the multimodal representation of images without semantic annotations, allowing us to index new images using visual features only, and also accepting single example images as queries. Experimental evaluations on three different histology image data sets show that our strategy is a simple, yet effective approach to building multimodal representations for histology image search, and outperforms the response of the popular late fusion approach to combine information.
    Journal of Biomedical Informatics 05/2014; 51. DOI:10.1016/j.jbi.2014.04.016
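The reconstruction principle the abstract relies on, recovering a multimodal representation for an image that has visual features only, can be sketched with a toy factorization: given a learned visual basis Wv, infer the latent code h from the visual vector v alone, then project h through the text basis Wt to recover the missing modality. The matrices, dimensions, and plain gradient-descent solver are illustrative, not the paper's factorization algorithm:

```python
def encode_visual(Wv, v, steps=500, lr=0.05):
    # Gradient descent on ||Wv h - v||^2 to recover the latent code h
    # for a new, unannotated image from visual features alone.
    k = len(Wv[0])
    h = [0.0] * k
    for _ in range(steps):
        r = [sum(Wv[i][j] * h[j] for j in range(k)) - v[i]
             for i in range(len(v))]                  # residual Wv h - v
        for j in range(k):
            grad = sum(Wv[i][j] * r[i] for i in range(len(v)))
            h[j] -= lr * grad
    return h

def reconstruct(W, h):
    # Project a latent code back into a modality's feature space (W h).
    return [sum(W[i][j] * h[j] for j in range(len(h))) for i in range(len(W))]
```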
  •
    ABSTRACT: Cost-benefit analysis is a prerequisite for making good business decisions. In the business environment, companies intend to make profit from maximizing information utility of published data while having an obligation to protect individual privacy. In this paper, we quantify the trade-off between privacy and data utility in health data publishing in terms of monetary value. We propose an analytical cost model that can help health information custodians (HICs) make better decisions about sharing person-specific health data with other parties. We examine relevant cost factors associated with the value of anonymized data and the possible damage cost due to potential privacy breaches. Our model guides an HIC to find the optimal value of publishing health data and could be utilized for both perturbative and non-perturbative anonymization techniques. We show that our approach can identify the optimal value for different privacy models, including K-anonymity, LKC-privacy, and ∊-differential privacy, under various anonymization algorithms and privacy parameters through extensive experiments on real-life data.
    Journal of Biomedical Informatics 04/2014; 50. DOI:10.1016/j.jbi.2014.04.012
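The trade-off the cost model formalizes can be made concrete with toy functional forms: the utility of the released data decays with the anonymity level k, while expected breach damage decays faster. The 1/k and 1/k² shapes below are illustrative stand-ins for the paper's calibrated cost factors, as are the parameter names:

```python
def net_value(k, base_value, damage, breach_prob):
    # Utility of the published data falls as anonymity grows, while
    # re-identification risk falls faster; the HIC's net value is the
    # difference (all functional forms are illustrative).
    utility = base_value / k
    expected_damage = damage * breach_prob / k ** 2
    return utility - expected_damage

def best_k(ks, **params):
    # The optimal anonymity level maximizes net value over candidates.
    return max(ks, key=lambda k: net_value(k, **params))
```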
  •
    ABSTRACT: Most of the information in Electronic Health Records (EHRs) is represented in free textual form. Practitioners searching EHRs need to phrase their queries carefully, as the record might use synonyms or other related words. In this paper we show that an automatic query expansion method based on the Unified Medical Language System (UMLS) Metathesaurus improves the results of a robust baseline when searching EHRs. The method uses a graph representation of the lexical units, concepts and relations in the UMLS Metathesaurus. It is based on random walks over the graph, which start on the query terms. Random walks are a well-studied technique on both Web and knowledge-base datasets. Our experiments over the TREC Medical Records track show improvements in both the 2011 and 2012 datasets over a strong baseline. Our analysis shows that the success of our method is due to the automatic expansion of the query with extra terms, even when they are not directly related in the UMLS Metathesaurus. The terms added in the expansion go beyond simple synonyms, and also add other kinds of topically related terms. Expansion of queries using related terms in the UMLS Metathesaurus beyond synonymy is an effective way to overcome the gap between query and document vocabularies when searching for patient cohorts.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.013
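Random walks that start on the query terms amount to a random walk with restart (personalized PageRank) over the concept graph. A minimal power-iteration version on an invented mini-graph shows how topically related terms, not just synonyms, pick up scores; the graph and term names stand in for the UMLS Metathesaurus:

```python
def random_walk_expansion(graph, seeds, alpha=0.85, iters=100):
    # Random walk with restart: at each step the walker follows an edge
    # with probability alpha or teleports back to a query term with
    # probability 1 - alpha; scores rank candidate expansion terms.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    rank = dict(restart)
    for _ in range(iters):
        new = {n: (1 - alpha) * restart[n] for n in graph}
        for n, neighbours in graph.items():
            share = alpha * rank[n] / len(neighbours)
            for m in neighbours:
                new[m] += share
        rank = new
    return rank
```

Seeded on "fever", the walk scores the synonym "pyrexia" highly but also gives weight to "sepsis" two hops away, mirroring the abstract's point about expansion beyond synonymy.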
  •
    ABSTRACT: Medical documentation is a time-consuming task and there is a growing number of documentation requirements. In order to improve documentation, harmonization and standardization based on existing forms and medical concepts are needed. Systematic analysis of forms can contribute to standardization, building upon new methods for automated comparison of forms. The objectives of this research are the quantification and comparison of data elements for breast and prostate cancer, to discover similarities, differences and reuse potential between documentation sets. In addition, common data elements for each entity should be identified by automated comparison of forms. A collection of 57 forms regarding prostate and breast cancer from quality management, registries, clinical documentation of two university hospitals (Erlangen, Münster), research datasets, certification requirements and trial documentation was transformed into the Operational Data Model (ODM). These ODM files were semantically enriched with concept codes and analyzed with the compareODM algorithm. Comparison results were aggregated and lists of common concepts were generated. Grid images, dendrograms and spider charts were used for illustration. Overall, 1008 data elements for prostate cancer and 1232 data elements for breast cancer were analyzed. Average routine documentation consists of 390 data elements per disease entity and site. Comparisons of forms identified up to 20 comparable data elements in cancer conference forms from both hospitals. Urology forms contain up to 53 comparable data elements with quality management forms and up to 21 with registry forms. Urology documentation of both hospitals contains up to 34 comparable items with international common data elements. Clinical documentation sets share up to 24 comparable data elements with trial documentation. Within clinical documentation, administrative items are the most common comparable items.
Selected common medical concepts are contained in up to 16 forms. The amount of documentation for cancer patients is enormous. There is an urgent need for standardized structured single source documentation. Semantic annotation is time-consuming, but enables automated comparison between different form types, hospital sites and even languages. This approach can help to identify common data elements in medical documentation. Standardization of forms and building up forms on the basis of coding systems is desirable. Several comparable data elements within the analyzed forms demonstrate the harmonization potential, which would enable better data reuse. Identifying common data elements in medical forms from different settings with systematic and automated form comparison is feasible.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.008
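At its core, the automated comparison reduces each semantically annotated form to a set of concept codes and intersects those sets across forms. A minimal sketch, with illustrative UMLS-style codes standing in for the compareODM annotations:

```python
from collections import Counter

def common_concepts(forms, min_forms=2):
    # Each form is reduced to the set of concept codes attached to its
    # data elements; a code is "common" when it appears in at least
    # `min_forms` of the forms being compared.
    counts = Counter(code for codes in forms.values() for code in set(codes))
    return {code for code, n in counts.items() if n >= min_forms}
```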
  •
    ABSTRACT: Poor device design that fails to adequately account for user needs, cognition, and behavior is often responsible for use errors resulting in adverse events. This poor device design is also often latent, and could be responsible for "No Fault Found" (NFF) reporting, in which medical devices sent for repair by clinical users are found to be operating as intended. Unresolved NFF reports may contribute to incident under reporting, clinical user frustration, and biomedical engineering technologist inefficacy. This study uses human factors engineering methods to investigate the relationship between NFF reporting frequency and device usability. An analysis of medical equipment maintenance data was conducted to identify devices with a high NFF reporting frequency. Subsequently, semi-structured interviews and heuristic evaluations were performed in order to identify potential usability issues. Finally, usability testing was conducted in order to validate that latent usability related design faults result in a higher frequency of NFF reporting. The analysis of medical equipment maintenance data identified six devices with a high NFF reporting frequency. Semi-structured interviews, heuristic evaluations and usability testing revealed that usability issues caused a significant portion of the NFF reports. Other factors suspected to contribute to increased NFF reporting include accessory issues, intermittent faults and environmental issues. Usability testing conducted on three of the devices revealed 23 latent usability related design faults. These findings demonstrate that latent usability related design faults manifest themselves as an increase in NFF reporting and that devices containing usability related design faults can be identified through an analysis of medical equipment maintenance data.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.009
  •
    ABSTRACT: Interpretation of the cardiotocogram (CTG) is a difficult task, since its evaluation is complicated by great inter- and intra-individual variability. Previous studies have predominantly analyzed clinicians' agreement on CTG evaluation based on quantitative measures (e.g. the kappa coefficient) that do not offer any insight into clinical decision making. In this paper we aim to examine the agreement on evaluation in detail and provide a data-driven analysis of clinical evaluation. For this study, nine obstetricians provided clinical evaluations of 634 CTG recordings (each ca. 60 min long). We studied the agreement on evaluation and its dependence on the increasing number of clinicians involved in the final decision. We showed that, despite the large number of clinicians, agreement on CTG evaluation is difficult to reach. The main reason is the inherent inter- and intra-observer variability of CTG evaluation. A latent class model provides a better and more natural way to aggregate CTG evaluations than majority voting, especially for larger numbers of clinicians. Significant improvement was reached in particular for the pathological evaluations, giving a new insight into the process of CTG evaluation. Further, the analysis of the latent class model revealed that clinicians unconsciously use four classes when evaluating CTG recordings, despite the fact that the clinical evaluation was based on the FIGO guidelines, where three classes are defined.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.010
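Why full agreement is hard to reach even with a large panel can be seen with back-of-the-envelope arithmetic on independent raters, alongside the majority-voting baseline the latent class model is compared against. The accuracy and class counts are illustrative, not fitted to the study:

```python
from collections import Counter

def unanimity_prob(m, p, k):
    # Chance that m independent raters all give the same label when each
    # is correct with probability p over k classes (uniform confusion):
    # unanimity decays roughly like p**m as the panel grows.
    q = (1 - p) / (k - 1)
    return p ** m + (k - 1) * q ** m

def majority_vote(labels):
    # The aggregation baseline; a tie leaves the recording undecided,
    # one reason latent-class aggregation can be more natural.
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```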
  •
    ABSTRACT: A myriad of new tools and algorithms have been developed to help public health professionals analyze and visualize the complex data used in infectious disease control. To better understand approaches to meet these users' information needs, we conducted a systematic literature review focused on the landscape of infectious disease visualization tools for public health professionals, with a special emphasis on geographic information systems (GIS), molecular epidemiology, and social network analysis. The objectives of this review are to: (1) identify public health user needs and preferences for infectious disease information visualization tools; (2) identify existing infectious disease information visualization tools and characterize their architecture and features; (3) identify commonalities among approaches applied to different data types; and (4) describe tool usability evaluation efforts and barriers to the adoption of such tools. We identified articles published in English from January 1, 1980 to June 30, 2013 from five bibliographic databases. Articles with a primary focus on infectious disease visualization tools, needs of public health users, or usability of information visualizations were included in the review. A total of 88 articles met our inclusion criteria. Users were found to have diverse needs, preferences and uses for infectious disease visualization tools, and the existing tools are correspondingly diverse. The architecture of the tools was inconsistently described, and few tools in the review discussed the incorporation of usability studies or plans for dissemination. Many studies identified concerns regarding data sharing, confidentiality and quality. Existing tools offer a range of features and functions that allow users to explore, analyze, and visualize their data, but the tools are often for siloed applications. 
Commonly cited barriers to widespread adoption included lack of organizational support, access issues, and misconceptions about tool use. As the volume and complexity of infectious disease data increases, public health professionals must synthesize highly disparate data to facilitate communication with the public and inform decisions regarding measures to protect the public's health. Our review identified several themes: consideration of users' needs, preferences, and computer literacy; integration of tools into routine workflow; complications associated with understanding and use of visualizations; and the role of user trust and organizational support in the adoption of these tools. Interoperability also emerged as a prominent theme, highlighting challenges associated with the increasingly collaborative and interdisciplinary nature of infectious disease control and prevention. Future work should address methods for representing uncertainty and missing data to avoid misleading users as well as strategies to minimize cognitive overload. Funding for this study was provided by the NIH (Grant# 1R01LM011180-01A1).
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.006
  •
    ABSTRACT: The objective was to create an automated algorithm for predicting elderly patients' medication-related risks for readmission and to validate it by comparing its results with a manual analysis of the same patient population. Outcome and Assessment Information Set (OASIS) and medication data were reused from a previous, manual study of 911 patients from 15 Medicare-certified home health care agencies. The medication data were converted into standardized drug codes using APIs managed by the National Library of Medicine (NLM), and then integrated into an automated algorithm that calculates patients' high risk medication regime scores (HRMRs). The results of the algorithm and the manual process were compared to determine how frequently the algorithm derived HRMR scores predictive of readmission. HRMR scores are composed of polypharmacy (number of drugs), Potentially Inappropriate Medications (PIM) (drugs risky to the elderly), and the Medication Regimen Complexity Index (MRCI) (complex dose forms, instructions or administration). The algorithm produced polypharmacy, PIM, and MRCI scores that matched 99, 87, and 99 percent of the scores, respectively, from the manual analysis. Imperfect match rates resulted from discrepancies in how drugs were classified and coded by the manual analysis vs. the automated algorithm. HRMR rules lack clarity, resulting in clinical judgments for manual coding that were difficult to replicate in the automated analysis. The high comparison rates for the three measures suggest that an automated clinical tool could use patients' medication records to predict their risks of avoidable readmissions.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.004
  •
    ABSTRACT: Advanced Cardiac Life Support (ACLS) is a series of team-based, sequential and time-constrained interventions, requiring effective communication and coordination of activities that are performed by the care provider team on a patient undergoing cardiac arrest or respiratory failure. State-of-the-art ACLS training is conducted in a face-to-face environment under expert supervision and suffers from several drawbacks, including conflicting care provider schedules and the high cost of training equipment. The major objective of this study is to describe the design, implementation, and evaluation of a novel approach to delivering ACLS training to care providers, using a proposed virtual reality simulator that can overcome the challenges and drawbacks imposed by the traditional face-to-face training method. We compare the efficacy and performance outcomes associated with traditional ACLS training with the proposed novel approach of using a virtual reality (VR) based ACLS training simulator. One hundred and forty-eight (148) ACLS-certified clinicians, forming 26 care provider teams, were enrolled in this study. Each team was randomly assigned to one of three treatment groups: control (traditional ACLS training), persuasive (VR ACLS training with comprehensive feedback components), or minimally persuasive (VR ACLS training with limited feedback components). The teams were tested across two different ACLS procedures that vary in the degree of task complexity: ventricular fibrillation or tachycardia (VFib/VTach) and pulseless electrical activity (PEA). The difference in performance between the control and persuasive groups was not statistically significant (P = .37 for PEA and P = .1 for VFib/VTach). However, the difference in performance between the control and minimally persuasive groups was significant (P = .05 for PEA and P = .02 for VFib/VTach).
The pre-post comparison of performances showed that the control (P = .017 for PEA, P = .01 for VFib/VTach) and persuasive (P = .02 for PEA, P = .048 for VFib/VTach) groups improved their performances significantly, whereas the minimally persuasive group did not (P = .45 for PEA, P = .46 for VFib/VTach). Results also suggest that the benefit of persuasiveness is constrained by the potentially interruptive nature of these features. Our results indicate that VR-based ACLS training with proper feedback components can provide a learning experience similar to face-to-face training, and therefore could serve as a more easily accessed supplementary tool to traditional ACLS training. Our findings also suggest that persuasive features in VR environments have to be designed with the interruptive nature of the feedback elements in mind.
    Journal of Biomedical Informatics 04/2014; 51. DOI:10.1016/j.jbi.2014.04.005
  •
    ABSTRACT: While the study of privacy preserving data publishing has drawn considerable interest, some recent work has shown that existing mechanisms do not limit all inferences about individuals. This paper is a positive note in response to this finding. In contrast to all existing work known to us, we point out that not all inference attacks should be countered, and based on this we propose a model called SPLU. This model protects sensitive information, by which we refer to answers for aggregate queries with small sums, while queries with large sums are answered with higher accuracy. Using SPLU, we introduce a sanitization algorithm to protect data while maintaining high data utility for queries with large sums. Empirical results show that our method behaves as desired.
    Journal of Biomedical Informatics 04/2014; 50. DOI:10.1016/j.jbi.2014.04.002
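The SPLU idea, answering small (sensitive) sums coarsely while returning large sums accurately, can be caricatured in a few lines. The Gaussian noise mechanism, threshold, and function name are illustrative choices, not the paper's sanitization algorithm:

```python
import random

def splu_answer(true_sum, threshold, heavy_noise, rng):
    # Aggregate queries whose true sum is small (i.e., sensitive) are
    # answered with heavy noise, protecting small aggregates; large sums
    # are returned almost exactly, preserving data utility.
    if true_sum < threshold:
        return true_sum + rng.gauss(0, heavy_noise)
    return true_sum + rng.gauss(0, 1.0)
```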