Elmer Bernstam

  • The University of Texas Health Science Center at Houston

About

195
Publications
24,441
Reads
5,922
Citations
Introduction
Current institution
The University of Texas Health Science Center at Houston

Publications (195)
Preprint
Full-text available
Electronic health record (EHR) data are a rich and invaluable source of real-world clinical information, enabling detailed insights into patient populations, treatment outcomes, and healthcare practices. The availability of large volumes of EHR data is critical for advancing translational research and developing innovative technologies such as art...
Article
Full-text available
Background Scalable identification of patients with post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms, which has led to suboptimal accuracy, demographic biases, and underestimation of the PASC. Methods In a retrospective case-control study, we developed a precision phenotyping algo...
Article
Objective: Duplicate patient records can increase cost and medical errors. We assessed the association between demographic factors, comorbidities, healthcare usage and duplicate electronic health records. Materials and methods: We analyzed the association between duplicate patient records and multiple demographic variables (race, Hispanic ethnic...
Article
Objectives Healthcare organizations, including Clinical and Translational Science Awards (CTSA) hubs funded by the National Institutes of Health, seek to enable secondary use of electronic health record (EHR) data through an enterprise data warehouse for research (EDW4R), but optimal approaches are unknown. In this qualitative study, our goal was t...
Preprint
Scalable identification of patients with the post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms and the suboptimal accuracy, demographic biases, and underestimation of the PASC diagnosis code (ICD-10 U09.9). In a retrospective case-control study, we developed a precision phenotyping...
Article
Full-text available
Introduction The focus on social determinants of health (SDOH) and their impact on health outcomes is evident in U.S. federal actions by Centers for Medicare & Medicaid Services and Office of National Coordinator for Health Information Technology. The disproportionate impact of COVID-19 on minorities and communities of color heightened awareness of...
Article
Full-text available
Multiple Sclerosis (MS) is a chronic disease developed in the human brain and spinal cord, which can cause permanent damage or deterioration of the nerves. The severity of MS disease is monitored by the Expanded Disability Status Scale, composed of several functional sub-scores. Early and accurate classification of MS disease severity is critical f...
Article
Variation in availability, format, and standardization of patient attributes across health care organizations impacts patient-matching performance. We report on the changing nature of patient-matching features available from 2010–2020 across diverse care settings. We asked 38 health care provider organizations about their current patient attribute...
Article
Full-text available
Objective: Medication discrepancies between clinical systems may pose a patient safety hazard. In this paper, we identify challenges and quantify medication discrepancies across transitions of care. Materials and methods: We used structured clinical data and free-text hospital discharge summaries to compare active medications lists at four time...
Article
Full-text available
Background We propose a new deep learning model to identify unnecessary hemoglobin (Hgb) tests for patients admitted to the hospital, which can help reduce health risks and healthcare costs. Methods We collected internal patient data from a teaching hospital in Houston and external patient data from the MIMIC III database. The study used a conserv...
Preprint
Multiple Sclerosis (MS) is a chronic disease developed in the human brain and spinal cord, which can cause permanent damage or deterioration of the nerves. The severity of MS disease is monitored by the Expanded Disability Status Scale (EDSS), composed of several functional sub-scores. Early and accurate classification of MS disease severity is critica...
Article
Building on previous work to define the scientific discipline of biomedical informatics, we present a framework that categorizes fundamental challenges into groups based on data, information, and knowledge, along with the transitions between these levels. We define each level and argue that the framework provides a basis for separating informatics...
Article
Full-text available
Unfractionated heparin (UFH) and low molecular weight heparin (LMWH) are often administered to prevent venous thromboembolism (VTE) in critically ill patients. However, the preferred prophylactic agent (UFH or LMWH) is not known. We compared the all-cause mortality rate in patients receiving UFH to LMWH for VTE prophylaxis. We conducted a retrospec...
Article
Full-text available
Objective: SNOMED CT is the largest clinical terminology worldwide. Quality assurance of SNOMED CT is of utmost importance to ensure that it provides accurate domain knowledge to various SNOMED CT-based applications. In this work, we introduce a deep learning-based approach to uncover missing is-a relations in SNOMED CT. Materials and methods: O...
Article
Full-text available
Objective: To evaluate tokens commonly used by clinical research consortia to aggregate clinical data across institutions. Materials and methods: This study compares tokens alone and token-based matching algorithms against manual annotation for 20,002 record pairs extracted from University of Texas Houston (UTH)'s clinical data warehouse in term...
Article
Full-text available
Background Diabetes and depression affect a significant percentage of the world’s total population, and the management of these conditions is critical for reducing the global burden of disease. Medication adherence is crucial for improving diabetes and depression outcomes, and research is needed to elucidate barriers to medication adherence, includ...
Preprint
Unnecessary laboratory tests present health risks and increase healthcare costs. We propose a new deep learning model to identify unnecessary hemoglobin (Hgb) tests for patients admitted to the hospital. Machine learning models might generate less reliable results due to noisy inputs containing low-quality information. We estimate prediction confid...
Article
Full-text available
Although pharmaceutical products undergo clinical trials to profile efficacy and safety, some adverse drug reactions (ADRs) are only discovered after release to market. Post-market drug safety surveillance - pharmacovigilance - leverages information from various sources to proactively identify such ADRs. Clinical notes are one source of observation...
Article
Full-text available
Providers currently rely on universal screening to identify health-related social needs (HRSNs). Predicting HRSNs using EHR and community-level data could be more efficient and less resource intensive. Using machine learning models, we evaluated the predictive performance of HRSN status from EHR and community-level social determinants of health (SD...
Article
Objective: Among National Institutes of Health Clinical and Translational Science Award (CTSA) hubs, effective approaches for enterprise data warehouses for research (EDW4R) development, maintenance, and sustainability remain unclear. The goal of this qualitative study was to understand CTSA EDW4R operations within the broader contexts of academic...
Article
Full-text available
Objectives Scanned documents (SDs), while common in electronic health records and potentially rich in clinically relevant information, rarely fit well with clinician workflow. Here, we identify scanned imaging reports requiring follow-up with high recall and practically useful precision. Materials and methods We focused on identifying imaging find...
Article
Full-text available
PURPOSE The Medicare Access and CHIP Reauthorization Act of 2015 (MACRA) requires eligible clinicians to report clinical quality measures (CQMs) in the Merit-Based Incentive Payment System (MIPS) to maximize reimbursement. To determine whether structured data in electronic health records (EHRs) were adequate to report MIPS CQMs, EHR data aggregated...
Article
Background Unnecessary labs contribute to iatrogenic harm and are a major source of waste in the healthcare system. We previously developed a machine learning algorithm to help clinicians identify unnecessary laboratory tests, but it has not been externally validated. In this study, we externally validate our ML algorithm. Methods To externally va...
Article
Objectives Electronic health records (EHRs) contain a large quantity of machine-readable data. However, institutions choose different EHR vendors, and the same product may be implemented differently at different sites. Our goal was to quantify the interoperability of real-world EHR implementations with respect to clinically relevant structured data...
Article
Introduction In the context of competency-based medical education, poor student performance must be accurately documented to allow learners to improve and to protect the public. However, faculty may be reluctant to provide evaluations that could be perceived as negative, and clerkship directors report that some students pass who should have failed....
Article
Background: Diabetes and depression affect a significant percentage of the world’s total population, and the management of these conditions is critical for reducing the global burden of disease. Medication adherence is critical for improving diabetes and depression outcomes, and research is needed to elucidate barriers to medication adherence, includin...
Article
Full-text available
Artificial intelligence (AI) is transforming many domains including finance, agriculture, defense and biomedicine. In this paper, we focus on the role of AI in clinical and translational research (CTR) including pre‐clinical research (T1), clinical research (T2), clinical implementation (T3) and public (or population) health (T4). Given the rapid e...
Article
Background and Aims Endoscopic ultrasound (EUS), magnetic resonance cholangiopancreatography (MRCP), and intraoperative cholangiogram (IOC) are the recommended diagnostic modalities for patients with intermediate probability for choledocholithiasis (IPC). The relative cost-effectiveness of these modalities in patients with cholelithiasis and IPC is...
Article
Full-text available
The comprehensive characterization of clinical and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) testing data for patients with repeatedly positive SARS-CoV-2 tests can help prioritize suspected cases of reinfection for investigation in the absence of sequencing data and for continued surveillance of the potential long-term health co...
Article
Objective: To expand the Operative Stress Score (OSS) by increasing procedural coverage, and to assess the association of OSS and frailty with Preoperative Acute Serious Conditions (PASC), complications, and mortality in females versus males. Summary background data: A Veterans Affairs male-dominated study showed high mortality in frail veterans even after very low...
Preprint
Background In the absence of genome sequencing, two positive molecular SARS-CoV-2 tests separated by negative tests, prolonged time, and symptom resolution remain the best surrogate measure of possible re-infection. Methods Using a large electronic health record database, we characterized clinical and testing data for 23 patients with repeatedly p...
Article
We assessed the scalability of a pharmacological signal detection use case from a single-site CDW to a large aggregated clinical data warehouse (single-site database with 754,214 distinct patient IDs vs. multisite database with 49.8M). We aimed to explore whether a larger clinical dataset would provide clearer signals for secondary analyses such as d...
Article
The paradigm of representation learning through transfer learning has the potential to greatly enhance clinical natural language processing. In this work, we propose a multi-task pre-training and fine-tuning approach for learning generalized and transferable patient representations from medical language. The model is first pre-trained with differen...
Article
Full-text available
Introduction Drug safety research asks causal questions but relies on observational data. Confounding bias threatens the reliability of studies using such data. The successful control of confounding requires knowledge of variables called confounders affecting both the exposure and outcome of interest. Causal knowledge of dynamic biological systems...
Preprint
The paradigm of representation learning through transfer learning has the potential to greatly enhance clinical natural language processing. In this work, we propose a multi-task pre-training and fine-tuning approach for learning generalized and transferable patient representations from medical language. The model is first pre-trained with differen...
Article
Objective To build a machine-learning model that predicts laboratory test results and provides a promising lab test reduction strategy, using spatial-temporal correlations. Materials and methods We developed a global prediction model to treat laboratory testing as a series of decisions by considering contextual information over time and across mod...
Article
Objectives Electronic Health Records (EHRs) contain scanned documents from a variety of sources such as identification cards, radiology reports, clinical correspondence, and many other document types. We describe the distribution of scanned documents at one health institution and describe the design and evaluation of a system to categorize document...
Article
Background AD is a devastating disease and its pathophysiology is still largely unknown. No treatment has been shown to be efficacious, so prevention remains a very valuable approach. The objective of this work is to statistically test the relationship between influenza vaccination and the incidence of AD to identify a candidate for AD prevention....
Article
Background and aims: The American Society for Gastrointestinal Endoscopy (ASGE) 2010 guidelines for suspected choledocholithiasis were recently updated by proposing more-specific criteria for selection of high-risk patients to undergo direct ERCP, while advocating use of additional imaging studies for intermediate- and low-risk individuals. We aim...
Article
Full-text available
Defining patient-to-patient similarity is essential for the development of precision medicine in clinical care and research. Conceptually, the identification of similar patient cohorts appears straightforward; however, universally accepted definitions remain elusive. Simultaneously, an explosion of vendors and published algorithms have emerged and...
Article
PURPOSE: Genomic analysis of individual patients is now affordable, and therapies targeting specific molecular aberrations are being tested in clinical trials. Genomically-informed therapy is relevant to many clinical domains, but is particularly applicable to cancer treatment. However, even specialized clinicians need help to interpret genomic dat...
Preprint
Full-text available
Introduction: Confounding bias threatens the reliability of observational studies and poses a significant scientific challenge. This paper introduces a framework for identifying confounding factors by exploiting literature-derived computable knowledge. In previous work, we have shown that semantic constraint search over computable knowledge extract...
Article
Full-text available
Global pandemics call for large and diverse healthcare data to study various risk factors, treatment options, and disease progression patterns. Despite the enormous efforts of many large data consortium initiatives, the scientific community still lacks a secure and privacy-preserving infrastructure to support auditable data sharing and facilitate autom...
Article
Full-text available
Global pandemics call for large and diverse healthcare data to study various risk factors, treatment options, and disease progression patterns. Despite the enormous efforts of many large data consortium initiatives, the scientific community still lacks a secure and privacy-preserving infrastructure to support auditable data sharing and facilitate...
Article
Full-text available
Importance Suicide is a leading cause of mortality, with suicide-related deaths increasing in recent years. Automated methods for individualized risk prediction have great potential to address this growing public health threat. To facilitate their adoption, they must first be validated across diverse health care settings. Objective To evaluate the...
Article
Serial laboratory testing is common, especially in Intensive Care Units (ICU). Such repeated testing is expensive and may even harm patients. However, identifying specific tests that can be omitted is challenging. The search space of different lab tests is large and the optimal reduction is hard to determine without modeling the time trajectory of...
Article
Objective: There is a lot of information about cancer in Electronic Health Record (EHR) notes that can be useful for biomedical research provided natural language processing (NLP) methods are available to extract and structure this information. In this paper, we present a scoping review of existing clinical NLP literature for cancer. Methods: We...
Article
Full-text available
The well-known hazards of repurposing data make Data Quality (DQ) assessment a vital step towards ensuring valid results regardless of analytical methods. However, there is no systematic process to implement DQ assessments for secondary uses of clinical data. This paper presents DataGauge, a systematic process for designing and implementing DQ asse...
Article
Full-text available
Objective: The study sought to design, pilot, and evaluate a federated data completeness tracking system (CTX) for assessing completeness in research data extracted from electronic health record data across the Accessible Research Commons for Health (ARCH) Clinical Data Research Network. Materials and methods: The CTX applies a systems-based app...
Article
Purpose: Many targeted therapies are currently available only via clinical trials. Therefore, routine precision oncology using biomarker-based assignment to drug depends on matching patients to clinical trials. A comprehensive and up-to-date trial database is necessary for optimal patient-trial matching. Methods: We describe processes for establ...
Article
e18080 Background: Implementation of electronic health records (EHRs) has engendered a large quantity of machine-readable data. However, different practices choose different EHR vendors and the same vendor product may be implemented differently at each practice. Motivated by the desire to facilitate appropriate integration of data, our goal was to...
Article
e18074 Background: Physician reimbursement for care delivered to Medicare beneficiaries fundamentally changed with the 2015 MACRA legislation, requiring eligible clinicians to report quality measures in the Merit-Based Incentive Payment System (MIPS). To determine whether structured data in electronic health records (EHRs) were adequate to report M...
Data
Data S1. Clinical and administrative data reuse for research protocol.
Preprint
Objective: There is a lot of information about cancer in Electronic Health Record (EHR) notes that can be useful for biomedical research provided natural language processing (NLP) methods are available to extract and structure this information. In this paper, we present a scoping review of existing clinical NLP literature for cancer. Methods: We id...
Article
Full-text available
Electronic health records are valuable for clinical and translational research. Institutions must protect patient privacy and comply with applicable regulations while allowing appropriate access to clinical data for research. The processes that investigators must follow to access clinical data can be substantially different at different institution...
Article
Purpose: We examined patterns, correlates, and the impact of cancer-related Internet use among patients with advanced cancer in a phase I clinical trials clinic for molecularly targeted oncologic agents. Methods: An anonymous questionnaire on Internet use for cancer-related purposes that incorporated input from phase I clinical trial oncologists...
Article
Full-text available
There are an ever-increasing number of reports and commentaries that describe the challenges and opportunities associated with the use of big data and data science (DS) in the context of biomedical education, research, and practice. These publications argue that there are substantial benefits resulting from the use of data-centric approaches to sol...
Article
Full-text available
Background: The role of cancer-related internet use on the patient-physician relationship has not been adequately explored among patients who are cancer-related internet users (CIUs) in early-phase clinical trial clinics. Objective: We examined the association between cancer-related internet use and the patient-physician relationship and decisio...
Preprint
BACKGROUND The role of cancer-related internet use on the patient-physician relationship has not been adequately explored among patients who are cancer-related internet users (CIUs) in early-phase clinical trial clinics. OBJECTIVE We examined the association between cancer-related internet use and the patient-physician relationship and decision ma...
Article
With the increasing availability of genomics, routine analysis of advanced cancers is now feasible. Treatment selection is frequently guided by the molecular characteristics of a patient's tumor, and an increasing number of trials are genomically-selected. Furthermore, multiple studies have demonstrated the benefit of therapies that are chosen base...
Article
Background: Genomic testing is increasingly performed in oncology, but concerns remain regarding the clinician's ability to interpret results. In the current study, the authors sought to determine the agreement between physicians and genomic annotators from the Precision Oncology Decision Support (PODS) team at The University of Texas MD Anderson...
Article
High-throughput genomic and molecular profiling of tumors is emerging as an important clinical approach. Molecular profiling is increasingly being used to guide cancer patient care, especially in advanced and incurable cancers. However, navigating the scientific literature to make evidence-based clinical decisions based on molecular profiling resul...
Article
Objective: One promise of nationwide adoption of electronic health records (EHRs) is the availability of data for large-scale clinical research studies. However, because the same patient could be treated at multiple health care institutions, data from only a single site might not contain the complete medical history for that patient, meaning that...
Article
Full-text available
At the ASCO Data Standards and Interoperability Summit held in May 2016, it was unanimously decided that four areas of current oncology clinical practice have serious, unmet health information technology needs. The following areas of need were identified: 1) omics and precision oncology, 2) advancing interoperability, 3) patient engagement, and 4)...
Article
Full-text available
In the information age, we expect data systems to make us more effective and efficient, not to make our lives more difficult! In this article, we discuss how we are using data systems, such as electronic health records (EHRs), to improve care delivery. We illustrate how US Oncology is beginning to use real-world evidence to facilitate trial accrual...
Article
In the information age, we expect data systems to make us more effective and efficient, not to make our lives more difficult! In this article, we discuss how we are using data systems, such as electronic health records (EHRs), to improve care delivery. We illustrate how US Oncology is beginning to use real-world evidence to facilitate trial accrual...
Article
Full-text available
Background: Patient matching is a key barrier to achieving interoperability. Patient demographic elements must be consistently collected over time and region to be valuable elements for patient matching. Objectives: We sought to determine what patient demographic attributes are collected at multiple institutions in the United States and see how...
Article
Full-text available
Purpose: Molecular profiling performed in the research setting usually does not benefit the patients that donate their tissues. Through a prospective protocol, we sought to determine the feasibility and utility of performing broad genomic testing in the research laboratory for discovery, and the utility of giving treating physicians access to rese...
Article
Full-text available
Observational data recorded in the Electronic Health Record (EHR) can help us better understand the effects of therapeutic agents in routine clinical practice. As such data were not collected for research purposes, their reuse for research must compensate for additional information that may bias analyses and lead to faulty conclusions. Confounding...
Conference Paper
Full-text available
Observational data recorded in the Electronic Health Record (EHR) can help us better understand the effects of therapeutic agents in routine clinical practice. As such data were not collected for research purposes, their reuse for research must compensate for additional information that may bias analyses and ultimately lead to faulty conclusions. C...
Article
This special issue on precision medicine informatics flowed from the AMIA 2015 Translational Bioinformatics Summit theme of “Accelerating Precision Medicine”1 and President Obama’s 2015 State of the Union call “to give all of us access to the personalized information we need to keep ourselves and our families healthier.”2 The goal is to focus on th...
Article
Full-text available
Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. Methods: We developed a semi-automatic fra...
Article
Introduction Genomic profiling information is frequently available to oncologists, enabling targeted cancer therapy. Because clinically relevant information is rapidly emerging in the literature and elsewhere, there is a need for informatics technologies to support targeted therapies. To this end, we have developed a system for Automated Identifica...
Article
Objective: To evaluate whether vector representations encoding latent topic proportions that capture similarities to MeSH terms can improve performance on biomedical document retrieval and classification tasks, compared to using MeSH terms. Materials and methods: We developed the TopicalMeSH representation, which exploits the 'correspondence' be...
Article
Background: Understanding patients' knowledge and prior information-seeking regarding personalized cancer therapy (PCT) may inform future patient information systems, consent for molecular testing and PCT protocols. We evaluated breast cancer patients' knowledge and information-seeking behaviors regarding PCT. Methods: Newly registered female br...
Article
Large clinical datasets can be used to discover and monitor drug side effects. Many previous studies analyzed symptom data as discrete events. However, some drug side effects are inferred from continuous variables such as weight or blood pressure. These require additional assumptions for analysis. For example, we can define positive/negative thresh...
Article
Full-text available
Rapidly improving understanding of molecular oncology, emerging novel therapeutics, and increasingly available and affordable next-generation sequencing have created an opportunity for delivering genomically informed personalized cancer therapy. However, to implement genomically informed therapy requires that a clinician interpret the patient's mol...
Article
Automatically identifying specific phenotypes in free-text clinical notes is critically important for the reuse of clinical data. In this study, the authors combine expert-guided feature (text) selection with one-class classification for text processing. To compare the performance of one-class classification to traditional binary classification; to...
Article
Full-text available
Thirty-Seventh Annual CTRC-AACR San Antonio Breast Cancer Symposium; December 9-13, 2014; San Antonio, TX INTRODUCTION: Breast cancer patients and providers are increasingly interested in personalized cancer therapy. Information-seeking behaviors and knowledge about personalized cancer therapy, cancer genetics, and molecular testing may influence...
Article
Full-text available
Ambiguous gene names in the biomedical literature are a barrier to accurate information extraction. To overcome this hurdle, we generated Ontology Fingerprints for selected genes that are relevant for personalized cancer therapy. These Ontology Fingerprints were used to evaluate the association between genes and biomedical literature to disambiguat...
Article
In alignment with a major shift toward patient-centered care as the model for improving care in our health system, informatics is transforming patient-provider relationships and overall care delivery. AMIA's 2013 Health Policy Invitational was focused on examining existing challenges surrounding full engagement of the patient and crafting a researc...
Article
Full-text available
BACKGROUND This study assessed attitudes of breast cancer patients toward molecular testing for personalized therapy and research. METHODS A questionnaire was given to female breast cancer patients presenting to a cancer center. Associations between demographic and clinical variables and attitudes toward molecular testing were evaluated. RESULTS Three...
