Journal of Biomedical Informatics (J Biomed Informat)

Publisher: Elsevier

Journal description

The Journal of Biomedical Informatics (formerly Computers and Biomedical Research) has been redesigned to reflect a commitment to high-quality original research papers and reviews in the area of biomedical informatics. Although published articles are motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, imaging, and bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices, and formal evaluations of completed systems, including clinical trials of information technologies, would generally be more suitable for publication in other venues. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report.

Current impact factor: 2.19

Impact Factor Rankings

2016 Impact Factor Available summer 2017
2014 / 2015 Impact Factor 2.194
2013 Impact Factor 2.482
2012 Impact Factor 2.131
2011 Impact Factor 1.792
2010 Impact Factor 1.719
2009 Impact Factor 2.432
2008 Impact Factor 1.924
2007 Impact Factor 2
2006 Impact Factor 2.346
2005 Impact Factor 2.388
2004 Impact Factor 1.013
2003 Impact Factor 0.855
2002 Impact Factor 0.862

[Figure: impact factor over time]

Additional details

5-year impact: 3.44
Cited half-life: 5.20
Immediacy index: 0.61
Eigenfactor: 0.01
Article influence: 1.14
Website: Journal of Biomedical Informatics website
Other titles: Journal of biomedical informatics (Online)
ISSN: 1532-0480
OCLC: 45147742
Material type: Document, Periodical, Internet resource
Document type: Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details


  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on any website, including arXiv and RePEc
    • Author's post-print on author's personal website immediately
    • Author's post-print on open access repository after an embargo period of between 12 months and 48 months
    • Permitted deposit due to Funding Body, Institutional and Governmental policy or mandate, may be required to comply with embargo periods of 12 months to 48 months
    • Author's post-print may be used to update arXiv and RePEc
    • Publisher's version/PDF cannot be used
    • Must link to publisher version with DOI
    • Author's post-print must be released with a Creative Commons Attribution Non-Commercial No Derivatives License
    • Publisher last reviewed on 03/06/2015

Publications in this journal

    ABSTRACT: Computerized survival prediction in healthcare, which identifies the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) with the probabilistic loss function, to develop and validate prognostic risk models to predict 1-, 2-, and 5-year survival in heart failure (HF) using data from electronic health records (EHRs) at Mayo Clinic. CPXR(Log) constructs a pattern-aided logistic regression model defined by several patterns and corresponding local logistic regression models. One of the models generated by CPXR(Log) achieved an AUC and accuracy of 0.94 and 0.91, respectively, and significantly outperformed prognostic models reported in prior studies. Data extracted from EHRs allowed incorporation of patient co-morbidities into our models, which helped improve the performance of the CPXR(Log) models (15.9% AUC improvement), although it did not improve the accuracy of the models built by other classifiers. We also propose a probabilistic loss function to determine the large-error and small-error instances. The new loss function used in the algorithm outperforms other functions used in previous studies, with a 1% improvement in the AUC. This study revealed that using EHR data to build prediction models can be very challenging using existing classification methods due to the high dimensionality and complexity of EHR data. The risk models developed by CPXR(Log) also reveal that HF is a highly heterogeneous disease, i.e., different subgroups of HF patients require different types of considerations with their diagnosis and treatment. Our risk models provided two valuable insights for application of predictive modeling techniques in biomedicine: logistic risk models often make systematic prediction errors, and it is prudent to use subgroup-based prediction models such as those given by CPXR(Log) when investigating heterogeneous diseases.
    Article · Feb 2016 · Journal of Biomedical Informatics
    ABSTRACT: Wide-scale adoption of electronic medical records (EMRs) has created an unprecedented opportunity for the implementation of Rapid Learning Systems (RLSs) that leverage primary clinical data for real-time decision support. In cancer, where large variations among patient features leave gaps in traditional forms of medical evidence, the potential impact of a RLS is particularly promising. We developed the Melanoma Rapid Learning Utility (MRLU), a component of the RLS, providing an analytical engine and user interface that enables physicians to gain clinical insights by rapidly identifying and analyzing cohorts of patients similar to their own.
    Article · Feb 2016 · Journal of Biomedical Informatics
    ABSTRACT: The adverse behavioral and physiological shifts that accompany aging accelerate the occurrence of deleterious disorders. Contemporary research is focused on uncovering the role of genetic associations in Age-related Disorders (ARDs). The completion of the Human Genome Project and the HapMap project has generated a huge amount of data on genetic variations, and Genome-Wide Association Studies (GWAS) have identified genetic variations, essentially SNPs, associated with several disorders including ARDs. However, a repository that houses all such ARD associations is lacking. The present work is aimed at filling this void. A database, dbAARD (database of Aging and Age Related Disorders), has been developed which hosts information on more than 3000 genetic variations significantly (p-value < 0.05) associated with 51 ARDs. Furthermore, a machine learning based gene prediction tool, AGP (Age Related Disorders Gene Prediction), has been constructed by employing the rotation forest algorithm to prioritize genes associated with ARDs. The tool achieved a precision of 75%, recall of 76%, F-measure of 76% and AUC of 0.85. Both the web resources have been made available online at and respectively for easy retrieval and usage by the scientific community. We believe that this work may facilitate the analysis of the plethora of variants associated with ARDs and provide cues for deciphering the biology of aging.
    Article · Feb 2016 · Journal of Biomedical Informatics
    ABSTRACT: Multi-site Institutional Review Board (IRB) review of clinical research projects is an important but complex and time-consuming activity that is hampered by disparate non-interoperable computer systems for management of IRB applications. This paper describes our work toward harmonizing the workflow and data model of IRB applications through the development of a software-as-a-service shared-IRB platform for five institutions in South Carolina. Several commonalities and differences were recognized across institutions and a core data model that included the data elements necessary for IRB applications across all institutions was identified. We extended and modified the system to support collaborative reviews of IRB proposals within routine workflows of participating IRBs. Overall about 80% of IRB application content was harmonized across all institutions, establishing the foundation for a streamlined cooperative review and reliance. Since going live in 2011, 49 applications that underwent cooperative reviews over a three year period were approved, with the majority involving 2 out of 5 institutions. We believe this effort will inform future work on a common IRB data model that will allow interoperability through a federated approach for sharing IRB reviews and decisions with the goal of promoting reliance across institutions in the translational research community at large.
    Article · Jan 2016 · Journal of Biomedical Informatics
    ABSTRACT: Objectives: Increased adoption of electronic health records has resulted in increased availability of free text clinical data for secondary use. A variety of approaches to obtain actionable information from unstructured free text data exist. These approaches are resource intensive, inherently complex and rely on structured clinical data and dictionary-based approaches. We sought to evaluate the potential to obtain actionable information from free text pathology reports using routinely available tools and approaches that do not depend on dictionary-based approaches. Materials and methods: We obtained pathology reports from a large health information exchange and evaluated the capacity to detect cancer cases from these reports using 3 non-dictionary feature selection approaches, 4 feature subset sizes, and 5 clinical decision models: simple logistic regression, naïve bayes, k-nearest neighbor, random forest, and J48 decision tree. The performance of each decision model was evaluated using sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Results: Decision models parameterized using automated, informed, and manual feature selection approaches yielded similar results. Furthermore, non-dictionary classification approaches identified cancer cases present in free text reports with evaluation measures approaching and exceeding 80% to 90% for most metrics. Conclusion: Our methods are feasible and practical approaches for extracting substantial information value from free text medical data, and the results suggest that these methods can perform on par, if not better, than existing dictionary-based approaches. Given that public health agencies are often under-resourced and lack the technical capacity for more complex methodologies, these results represent potentially significant value to the public health field.
    Article · Jan 2016 · Journal of Biomedical Informatics
    ABSTRACT: Health insurers maintain large databases containing information on medical services utilized by claimants, often spanning several healthcare services and providers. Proper use of these databases could facilitate better clinical and administrative decisions. In these data sets, there exist many unequally spaced events, such as hospital visits. However, data mining of temporal data and point processes is still a developing research area and extracting useful information from such data series is a challenging task. In this paper, we developed a time series data mining approach to predict the number of days in hospital in the coming year for individuals from a general insured population based on their insurance claim data. In the proposed method, the data were windowed at four different timescales (bi-monthly, quarterly, half-yearly and yearly) to construct regularly spaced time series features extracted from such events, resulting in four associated prediction models. A comparison of these models indicates that models using a half-yearly windowing scheme deliver the best performance on all three populations (the whole population, a senior sub-population and a non-senior sub-population). The superiority of the half-yearly model was found to be particularly pronounced in the senior sub-population. A bagged decision tree approach was able to predict ‘no hospitalization’ versus ‘at least one day in hospital’ with a Matthews correlation coefficient (MCC) of 0.426. This was significantly better than the corresponding yearly model, which achieved 0.375 for this group of customers. Further reducing the length of the analysis windows to three or two months did not produce further improvements.
    Article · Jan 2016 · Journal of Biomedical Informatics
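The windowing idea and the evaluation metric named in the abstract above can be illustrated with a minimal sketch. It assumes events are given as day offsets from the start of an observation period. The function names, window length, and sample data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def window_counts(event_days, window_days, horizon_days):
    """Turn irregularly spaced events into a regularly spaced count series.

    event_days: day offsets of events (e.g. hospital visits), hypothetical input.
    window_days: width of each window in days.
    horizon_days: total length of the observation period in days.
    """
    n_windows = horizon_days // window_days
    counts = [0] * n_windows
    for day in event_days:
        idx = day // window_days
        # Ignore events that fall outside the observation period.
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Two 180-day windows over a 360-day period (an illustrative half-yearly split).
half_yearly = window_counts([3, 40, 95, 200, 359], 180, 360)  # [3, 2]
```

Narrower windows give a longer but sparser feature series. The abstract reports that the half-yearly scale worked best on the authors' data, which this sketch does not attempt to reproduce.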
    ABSTRACT: Objectives: To critically identify studies that evaluate the effects of cueing in virtual motor rehabilitation in patients having different neurological disorders and to make recommendations for future studies. Methods: MEDLINE®, IEEExplore, Science Direct, Cochrane library and Web of Science were searched until February 2015. We included studies that investigate the effects of cueing in virtual motor rehabilitation related to interventions for upper or lower extremities using auditory, visual, and tactile cues on motor performance in non-immersive, semi-immersive, or fully immersive virtual environments. These studies compared virtual cueing with an alternative or no intervention. Results: Ten studies with a total number of 153 patients were included in the review. All of them refer to the impact of cueing in virtual motor rehabilitation, regardless of the pathological condition. After selecting the articles, the following variables were extracted: year of publication, sample size, study design, type of cueing, intervention procedures, outcome measures, and main findings. The outcome evaluation was done at baseline and end of the treatment in most of the studies. All of the studies except one showed improvements in some or all outcomes after intervention, or, in some cases, in favor of the virtual rehabilitation group compared to the control group. Conclusions: Virtual cueing seems to be a promising approach to improve motor learning, providing a channel for non-pharmacological therapeutic intervention in different neurological disorders. However, further studies using larger and more homogeneous groups of patients are required to confirm these findings.
    Article · Jan 2016 · Journal of Biomedical Informatics
    ABSTRACT: Genomics is a promising tool that is becoming more widely available to improve the care and treatment of individuals. While there is much assertion, genomics will most certainly require the use of clinical decision support (CDS) to be fully realized in the routine clinical setting. The National Human Genome Research Institute (NHGRI) of the National Institutes of Health recently convened an in-person, multi-day meeting on this topic. It was widely recognized that there is a need to promote the innovation and development of resources for genomic CDS such as a CDS sandbox. The purpose of this study was to evaluate a proposed approach for such a genomic CDS sandbox among domain experts and potential users. Survey results indicate a significant interest and desire for a genomic CDS sandbox environment among domain experts. These results will be used to guide the development of a genomic CDS sandbox.
    Article · Jan 2016 · Journal of Biomedical Informatics
    ABSTRACT: Magnetic resonance guided focused ultrasound surgery (MRgFUS) has become an attractive, non-invasive treatment for benign and malignant tumours, and offers specific benefits for poorly accessible locations in the liver. However, the presence of the ribcage and the occurrence of liver motion due to respiration limit the applicability of MRgFUS. Several techniques are being developed to address these issues or to decrease treatment times in other ways. However, the potential benefit of such improvements has not been quantified. In this research, the detailed workflow of current MRgFUS procedures was determined qualitatively and quantitatively by using observation studies on uterine MRgFUS interventions, and the bottlenecks in MRgFUS were identified. A validated simulation model based on discrete events simulation was developed to quantitatively predict the effect of new technological developments on the intervention duration of MRgFUS on the liver. During the observation studies, the duration and occurrence frequencies of all actions and decisions in the MRgFUS workflow were registered, as were the occurrence frequencies of motion detections and intervention halts. The observation results show that current MRgFUS uterine interventions take on average 213 minutes. Organ motion was detected on average 2.9 times per intervention, of which on average 1.0 actually caused a need for rework. Nevertheless, these motion occurrences and the actions required to continue after their detection consumed on average 11% and up to 29% of the total intervention duration. The simulation results suggest that, depending on the motion occurrence frequency, the addition of new technology to automate currently manual MRgFUS tasks and motion compensation could potentially reduce the intervention durations by 98.4% (from 256 hours 5 minutes to 4 hours 4 minutes) in the case of 90% motion occurrence, and by 24% (from 5 hours 19 minutes to 4 hours 2 minutes) in the case of no motion. In conclusion, new tools were developed to predict how intervention durations will be affected by future workflow changes and by the introduction of new technology.
    Article · Jan 2016 · Journal of Biomedical Informatics
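The percentage reductions quoted in the abstract above follow directly from the stated durations. The short check below is only an arithmetic illustration, and the helper names are hypothetical.

```python
def to_minutes(hours, minutes):
    """Convert a duration given as hours and minutes to total minutes."""
    return hours * 60 + minutes


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return 100.0 * (before - after) / before


# 90% motion occurrence: 256 h 5 min down to 4 h 4 min.
with_motion = reduction_percent(to_minutes(256, 5), to_minutes(4, 4))

# No motion: 5 h 19 min down to 4 h 2 min.
no_motion = reduction_percent(to_minutes(5, 19), to_minutes(4, 2))

print(round(with_motion, 1), round(no_motion, 1))
```

Both values agree with the abstract's figures of 98.4% and 24% after rounding.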
    ABSTRACT: Objective: To design and assess a method for extracting clinically useful sentences from synthesized online clinical resources that represent the most clinically useful information for directly answering clinicians' information needs. Materials and methods: We developed a Kernel-based Bayesian Network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient population identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure. Results: The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept or text-based features reduced the F-measure to 62%, 66%, 58% and 69% respectively (p<0.01). The classifier applied to Medline sentences reached an F-measure of 73%, which is equivalent to the performance of the classifier on UpToDate sentences (p=0.62). Conclusions: The feature-rich approach significantly outperformed general baseline methods. This approach significantly outperformed classifiers based on a single type of feature. Different types of semantic features provided a unique contribution to overall classification performance. The classifier's model and features used for UpToDate generalized well to Medline abstracts.
    Article · Jan 2016 · Journal of Biomedical Informatics
    ABSTRACT: The identification of similar entities represented by records in different databases has drawn considerable attention in many application areas, including in the health domain. One important type of entity matching application that is vital for quality healthcare analytics is the identification of similar patients, known as similar patient matching. A key component of identifying similar records is the calculation of similarity of the values in attributes (fields) between these records. Due to increasing privacy and confidentiality concerns, using the actual attribute values of patient records to identify similar records across different organizations is becoming non-trivial because the attributes in such records often contain highly sensitive information such as personal and medical details of patients. Therefore, the matching needs to be based on masked (encoded) values while being effective and efficient to allow matching of large databases. Bloom filter encoding has widely been used as an efficient masking technique for privacy-preserving matching of string and categorical values. However, no work on Bloom filter-based masking of numerical data, such as integer (e.g. age), floating point (e.g. body mass index), and modulus (numbers wrap around upon reaching a certain value, e.g. date and time), which are commonly required in the health domain, has been presented in the literature. We propose a framework with novel methods for masking numerical data using Bloom filters, thereby facilitating the calculation of similarities between records. We conduct an empirical study on publicly available real-world datasets which shows that our framework provides efficient masking and achieves similar matching accuracy compared to the matching of actual unencoded patient records.
    Article · Dec 2015 · Journal of Biomedical Informatics
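For readers unfamiliar with Bloom filter masking, the sketch below shows the commonly used string variant that the abstract above mentions as existing practice: character q-grams are hashed into a bit set, and two encodings are compared with the Dice coefficient. It is not the paper's method for numerical data. The filter size, the number of hash functions, and the use of salted SHA-256 are assumptions chosen for illustration.

```python
import hashlib


def qgrams(value, q=2):
    """Split a padded, lower-cased string into its set of character q-grams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(value, size=1000, num_hashes=4, q=2):
    """Return the set of bit positions set in a Bloom filter for `value`.

    Each q-gram is hashed `num_hashes` times, with a salted SHA-256 standing
    in for independent hash functions (an illustrative choice).
    """
    bits = set()
    for gram in qgrams(value, q):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{k}:{gram}".encode("utf-8")).hexdigest()
            bits.add(int(digest, 16) % size)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Similar strings share q-grams, so their masked encodings overlap more.
close = dice(bloom_encode("smith"), bloom_encode("smyth"))
far = dice(bloom_encode("smith"), bloom_encode("jones"))
```

Only the bit positions are exchanged between parties, so similarity can be estimated without revealing the original values. Real deployments would also need keyed hashing and an analysis of known attacks, which this sketch omits.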
    ABSTRACT: Big data technologies are critical to the medical field, which requires new frameworks to leverage them. Such frameworks would help medical experts test hypotheses by querying huge volumes of unstructured medical data to provide better patient care. The objective of this work is to implement and examine the feasibility of having such a framework to provide efficient querying of unstructured data in unlimited ways. The feasibility study was conducted specifically in the epilepsy field. The proposed framework evaluates a query in two phases. In phase 1, structured data is used to filter the clinical data warehouse. In phase 2, feature extraction modules are executed on the unstructured data in a distributed manner via Hadoop to complete the query. Three modules have been created: volume comparer, surface to volume conversion, and average intensity. The framework allows for user-defined modules to be imported to provide unlimited ways to process the unstructured data, hence potentially extending the application of this framework beyond the epilepsy field. Two types of criteria were used to validate the feasibility of the proposed framework: the ability/accuracy of fulfilling an advanced medical query and the efficiency that Hadoop provides. For the first criterion, the framework executed an advanced medical query that spanned both structured and unstructured data with accurate results. For the second criterion, different architectures were explored to evaluate the performance of various Hadoop configurations and were compared to a traditional Single Server Architecture (SSA). The surface to volume conversion module performed up to 40 times faster than the SSA (using a 20 node Hadoop cluster) and the average intensity module performed up to 85 times faster than the SSA (using a 40 node Hadoop cluster). Furthermore, the 40 node Hadoop cluster executed the average intensity module on 10,000 models in 3 hours, which was not even practical for the SSA. The current study is limited to the epilepsy field, and further research and more feature extraction modules are required to show its applicability in other medical domains. The proposed framework advances data-driven medicine by unleashing the content of unstructured medical data in an efficient and unlimited way to be harnessed by medical experts.
    Article · Dec 2015 · Journal of Biomedical Informatics
    ABSTRACT: Due to the lack of an internationally accepted and adopted standard for coding health interventions, Austria has established its own country-specific procedure classification system – the Austrian Procedure Catalogue (APC). Even though the APC is an elaborate coding standard for medical procedures, it has shortcomings that limit its usability. In order to enhance usability and usefulness, especially for research purposes and e-health applications, we developed an ontologized version of the APC. In this paper we present a novel four-step approach for the ontology engineering process, which enables accurate extraction of relevant concepts for medical ontologies from written text.
    Article · Dec 2015 · Journal of Biomedical Informatics