Benjamin Glicksberg

Icahn School of Medicine at Mount Sinai | MSSM · Department of Genetics and Genomic Sciences

PhD

Publications: 422
Reads: 71,823
Citations: 13,840

Publications (422)
Article
Full-text available
Precision medicine can utilize new techniques in order to more effectively translate research findings into clinical practice. In this article, we first explore the limitations of traditional study designs, which stem from (to name a few): massive cost for the assembly of large patient cohorts; non-representative patient data; and the astounding co...
Article
Full-text available
Motivation: Electronic Health Records (EHR) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model could accelerate more EHR-based research b...
Article
Full-text available
Background Artificial intelligence (AI) and large language models (LLMs) can play a critical role in emergency room operations by augmenting decision-making about patient admission. However, there are no studies for LLMs using real-world data and scenarios, in comparison to and being informed by traditional supervised machine learning (ML) models....
Preprint
Full-text available
Large language models (LLMs) show promising accuracy on challenging tasks, including medical question answering. Yet, direct gains from model upgrades can plateau, and reliability issues persist. We introduce Iterative Consensus Ensemble (ICE), a proof-of-concept framework that refines answers through iterative reasoning and feedback among multiple...
Article
Full-text available
Diagnostic imaging is an integral part of identifying spondyloarthropathies (SpA), yet the interpretation of these images can be challenging. This review evaluated the use of deep learning models to enhance the diagnostic accuracy of SpA imaging. Following PRISMA guidelines, we systematically searched major databases up to February 2024, focusing o...
Article
Full-text available
Objective: This systematic review evaluates the current applications, advantages, and challenges of large language models (LLMs) in melanoma care. Methods: A systematic search was conducted in PubMed and Scopus databases for studies published up to 23 July 2024, focusing on the application of LLMs in melanoma. The review adhered to PRISMA guideline...
Preprint
Full-text available
Generative AI is transforming enterprise application development by enabling machines to create content, code, and designs. These models, however, demand substantial computational power and data management. Cloud computing addresses these needs by offering infrastructure to train, deploy, and scale generative AI models. This review examines cloud s...
Article
Full-text available
This review analyzes current clinical trials investigating large language models’ (LLMs) applications in healthcare. We identified 27 trials (5 published and 22 ongoing) across 4 main clinical applications: patient care, data handling, decision support, and research assistance. Our analysis reveals diverse LLM uses, from clinical documentation to m...
Article
Full-text available
Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model performance under increasing computational loads. We assessed t...
Article
Full-text available
[This corrects the article DOI: 10.1371/journal.pone.0275004.].
Preprint
Full-text available
Background and Aim: Multimodal large language models (LLMs) have shown potential in processing both text and image data for clinical applications. This study evaluated their diagnostic performance in identifying retinal diseases from optical coherence tomography (OCT) images. Methods: We assessed the diagnostic accuracy of GPT-4o and Claude Sonnet...
Preprint
Full-text available
Importance: Medical ethics is inherently complex, shaped by a broad spectrum of opinions, experiences, and cultural perspectives. The integration of large language models (LLMs) in healthcare is new and requires an understanding of their consistent adherence to ethical standards. Objective: To compare the agreement rates in answering questions b...
Preprint
Full-text available
Background and Aim: This study evaluates the diagnostic performance of multimodal large language models (LLMs), GPT-4o and Claude Sonnet 3.5, in detecting glaucoma from fundus images. We specifically assess the impact of prompt engineering and the use of reference images on model performance. Methods: We utilized the ACRIMA public dataset, comprisi...
Preprint
Full-text available
Large language models (LLMs) are increasingly integrated into healthcare, but concerns about potential sociodemographic biases persist. We aimed to assess biases in decision making by evaluating LLMs' responses to clinical scenarios across varied sociodemographic profiles. We utilized 500 emergency department vignettes, each representing the same cl...
Article
Full-text available
Multimodal technology is poised to revolutionize clinical practice by integrating artificial intelligence with traditional diagnostic modalities. This evolution traces its roots from Hippocrates’ humoral theory to the use of sophisticated AI-driven platforms that synthesize data across multiple sensory channels. The interplay between historical med...
Preprint
Background: Accurate medical coding is essential for clinical and administrative purposes but complicated, time-consuming, and biased. This study compares Retrieval-Augmented Generation (RAG)-enhanced LLMs to provider-assigned codes in producing ICD-10-CM codes from emergency department (ED) clinical records. Methods: Retrospective cohort study usi...
Article
Full-text available
Recent studies suggest that heparan sulfate proteoglycans (HSPG) contribute to the predisposition to, protection from, and potential treatment and prevention of Alzheimer’s disease (AD). Here, we used electronic health records (EHR) from two different health systems to examine whether heparin therapy was associated with a delayed diagnosis of AD de...
Preprint
Background and Aim: The potential of large language models (LLMs) like GPT-4 to generate clear and empathetic medical documentation is becoming increasingly relevant. This study evaluates these constructs in discharge letters generated by GPT-4 compared to those written by emergency department (ED) physicians. Methods: In this retrospective, blinde...
Preprint
Full-text available
Background and Aim: Vasculitides are rare inflammatory disorders that sometimes can be difficult to diagnose due to their diverse presentations. This review examines the use of Artificial Intelligence (AI) to improve diagnosis and outcome prediction in vasculitis. Methods: A systematic search of PubMed, Embase, Web of Science, IEEE Xplore, and Scop...
Article
Full-text available
Drug repurposing—identifying new therapeutic uses for approved drugs—is often a serendipitous and opportunistic endeavour to expand the use of drugs for new diseases. The clinical utility of drug-repurposing artificial intelligence (AI) models remains limited because these models focus narrowly on diseases for which some drugs already exist. Here w...
Preprint
Full-text available
Background: Large language models (LLMs) are gaining recognition across various medical fields; however, their specific role in dermatology, particularly in melanoma care, is not well-defined. This systematic review evaluates the current applications, advantages, and challenges associated with the use of LLMs in melanoma care. Methods: We conducted...
Preprint
In moral philosophy, two foundational approaches shape ethical decisions: deontology and utilitarianism, often exemplified by the "Trolley" dilemma. We conducted an experiment evaluating multiple LLMs, including OpenAI's GPT-o1, across five medical versions of this dilemma. While some models adhered to established ethical standards, others inconsis...
Article
Objectives Natural Language Processing (NLP) and Large Language Models (LLMs) have emerged as powerful tools in healthcare, offering advanced methods for analyzing unstructured clinical texts. This systematic review aims to evaluate the current applications of NLP and LLMs in rheumatology, focusing on their potential to improve disease detection, d...
Article
Full-text available
Objective The United States Medical Licensing Examination (USMLE) assesses physicians' competency. Passing this exam is required to practice medicine in the U.S. With the emergence of large language models (LLMs) like ChatGPT and GPT-4, understanding their performance on these exams illuminates their potential in medical education and healthcare....
Article
Full-text available
Large language models (LLMs) have significantly impacted various fields with their ability to understand and generate human‐like text. This study explores the potential benefits and limitations of integrating LLMs, such as ChatGPT, into haematology practices. Utilizing systematic review methodologies, we analysed studies published after 1 December...
Preprint
Full-text available
Background and Aim: Visual data from images is essential for many medical diagnoses. This study evaluates the performance of multimodal Large Language Models (LLMs) in integrating textual and visual information for diagnostic purposes. Methods: We tested GPT-4o and Claude Sonnet 3.5 on 120 clinical vignettes with and without accompanying images. Ea...
Preprint
Purpose This review analyzes the application of large language models (LLMs) in the field of cardiology, with a focus on evaluating their performance across various clinical tasks. Methods We conducted a systematic literature search on PubMed for studies published up to April 14, 2024. Our search used a wide range of keywords related to LLMs and...
Article
Full-text available
Objectives This study aims to assess the performance of a multimodal artificial intelligence (AI) model capable of analyzing both images and textual data (GPT-4V), in interpreting radiological images. It focuses on a range of modalities, anatomical regions, and pathologies to explore the potential of zero-shot generative AI in enhancing diagnostic...
Article
Background Predicting hospitalization from nurse triage notes has the potential to augment care. However, the choice of model for this goal requires careful consideration. Specifically, health systems vary in their available computational infrastructure and budget constraints. Objective To this end, we compared the...
Preprint
Full-text available
Background and Aim: Large language models (LLMs) show promise in healthcare, but their self-assessment capabilities remain unclear. This study evaluates the confidence levels and performance of 12 LLMs across five medical specialties to assess their ability to accurately judge their responses. Methods: We used 1965 multiple-choice questions from in...
Preprint
Full-text available
Rationale and Objectives: Over the past year, studies have been conducted to evaluate the performance of Large Language Models (LLMs), such as ChatGPT, in the fields of gynecologic oncology. This review aims to analyze the applications and risks associated with using LLMs in this specialized field. Materials and Methods: This systematic review was...
Article
Full-text available
This study was designed to assess how different prompt engineering techniques, specifically direct prompts, Chain of Thought (CoT), and a modified CoT approach, influence the ability of GPT-3.5 to answer clinical and calculation-based medical questions, particularly those styled like the USMLE Step 1 exams. To achieve this, we analyzed the response...
Preprint
Full-text available
Large Language Models (LLMs) are becoming integral to healthcare analytics. However, the influence of the temperature hyperparameter, which controls output randomness, remains poorly understood in clinical tasks. This study evaluates the effects of different temperature settings across various clinical tasks. We conducted a retrospective cohort stu...
Preprint
Full-text available
Background and Aim: In the last two years, natural language processing (NLP) has transformed significantly with the introduction of large language models (LLM). This review updates on NLP and LLM applications and challenges in gastroenterology and hepatology. Methods: Registered with PROSPERO (CRD42024542275) and adhering to PRISMA guidelines, we s...
Article
Artificial intelligence (AI) is an emerging technology with numerous healthcare applications. AI could prove particularly useful in the cardiac intensive care unit (CICU) where its capacity to analyze large datasets in real-time would assist clinicians in making more informed decisions. This systematic review aimed to explore current research on AI...
Article
Full-text available
Artificial intelligence, specifically advanced language models such as ChatGPT, has the potential to revolutionize various aspects of healthcare, medical education, and research. In this narrative review, we evaluate the myriad applications of ChatGPT in diverse healthcare domains. We discuss its potential role in clinical decision-making, explori...
Preprint
Full-text available
Aim Diagnostic imaging is an integral part of identifying spondyloarthropathies (SpA), yet the interpretation of these images can be challenging. This review evaluated the use of deep learning models to enhance the diagnostic accuracy of SpA imaging. Methods Following PRISMA guidelines, we systematically searched major databases up to February 202...
Preprint
Background/Aim Contrast-enhanced mammography (CEM) is a relatively novel imaging technique that enables both anatomical and functional breast imaging, with improved diagnostic performance compared to standard 2D mammography. The aim of this study is to systematically review the literature on deep learning (DL) applications for CEM, exploring how th...
Preprint
Full-text available
Rationale and Objectives: Large Language Models (LLMs) have the potential to enhance medical training, education, and diagnosis. However, since these models were not originally designed for medical purposes, there are concerns regarding their reliability and safety in clinical settings. This review systematically assesses the utility, advantages, a...
Article
This paper explores the emerging role of large language models (LLMs) in healthcare, offering an analysis of their applications and limitations. Attention mechanisms and transformer architectures enable LLMs to perform tasks like extracting clinical information and assisting in diagnostics. We highlight research that demonstrates early application...
Article
Full-text available
Aim: This study aimed to identify and analyze the top 100 most cited digital health and mobile health (m-health) publications. It could aid researchers in the identification of promising new research avenues, additionally supporting the establishment of international scientific collaboration between interdisciplinary research groups with demonstrat...
Preprint
Full-text available
Importance: Infant alertness and neurologic changes are assessed by exam, which can be intermittent and subjective. Reliable, continuous methods are needed. Objective: We hypothesized that our computer vision method to track movement, pose AI, could predict neurologic changes. Design: Retrospective observational study from 2021-2022. Setting: A lev...
Article
Full-text available
Background Writing multiple choice questions (MCQs) for the purpose of medical exams is challenging. It requires extensive medical knowledge, time and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. Methods The authors searched for studies published up to...
Article
Full-text available
Purpose Despite advanced technologies in breast cancer management, challenges remain in efficiently interpreting vast clinical data for patient-specific insights. We reviewed the literature on how large language models (LLMs) such as ChatGPT might offer solutions in this field. Methods We searched MEDLINE for relevant studies published before Dece...
Preprint
Full-text available
Background: With the advent of large language models (LLM), such as ChatGPT, natural language processing (NLP) is revolutionizing healthcare. We systematically reviewed NLP's role in rheumatology and assessed its impact on diagnostics, disease monitoring, and treatment strategies. Methods: Following PRISMA guidelines, we conducted a systematic sear...
Article
Artificial intelligence (AI) is a field of study that strives to replicate aspects of human intelligence into machines. Preventive cardiology, a subspeciality of cardiovascular (CV) medicine, aims to target and mitigate known risk factors for CV disease (CVD). AI's integration into preventive cardiology may introduce novel treatment interventions a...
Article
Full-text available
Artificial intelligence, specifically advanced language models such as ChatGPT, has the potential to revolutionize various aspects of healthcare, medical education, and research. In this review, we evaluate the myriad applications of artificial intelligence in diverse healthcare domains. We discuss its potential role in clinical decision-making, e...
Preprint
Ten years ago, it was predicted that the multi-omics revolution would also revolutionize space pharmacogenomics. Current barriers related to the findable, accessible, interoperable, and reproducible use of space-flown pharmaceutical data have contributed to a lack of progress beyond application of earth-based principles. To directly tackle these ch...
Preprint
Full-text available
Background: Natural Language Processing (NLP) and Large Language Models (LLMs) hold largely untapped potential in infectious disease management. This review explores their current use and uncovers areas needing more attention. Methods: This analysis followed systematic review procedures, registered with PROSPERO. We conducted a search across major...
Article
Full-text available
This review analyzes the most influential artificial intelligence (AI) studies in health and life sciences from the past three years, delineating the evolving role of AI in these fields. We identified and analyzed the top 50 cited articles on AI in biomedicine, revealing significant trends and thematic categorizations, including Drug Development, R...
Preprint
Full-text available
Background Writing multiple choice questions (MCQs) for the purpose of medical exams is challenging. It requires extensive medical knowledge, time and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. Methods The authors searched for studies published up to N...
Preprint
Full-text available
Objectives Simplifying medical information to make it understandable for patients, specifically in the case of radiology reports, is challenging. It requires time and effort from medical personnel. This systematic review focuses on the application of large language models (LLMs) in generating simplified radiological imaging reports, as well as answ...
Preprint
Full-text available
Purpose Writing multiple choice questions (MCQs) for the purpose of medical exams is challenging. It requires extensive medical knowledge, time and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. Methods The authors searched for studies published up to Nov...
Article
Full-text available
Target trial emulation is the process of mimicking target randomized trials using real-world data, where effective confounding control for unbiased treatment effect estimation remains a main challenge. Although various approaches have been proposed for this challenge, a systematic evaluation is still lacking. Here we emulated trials for thousands o...
Preprint
Objective Recent advancements in GPT-4 have enabled analysis of text with visual data. Diagnosis in ophthalmology is often based on ocular examinations and imaging, alongside the clinical context. The aim of this study was to evaluate the performance of multimodal GPT-4 (GPT-4V) in an integrated analysis of ocular images and clinical text. Methods...
Preprint
Full-text available
Objectives This study aims to assess the performance of OpenAI’s multimodal GPT-4, which can analyze both images and textual data (GPT-4V), in interpreting radiological images. It focuses on a range of modalities, anatomical regions, and pathologies to explore the potential of zero-shot generative-AI in enhancing diagnostic processes in radiology....
Preprint
Full-text available
Purpose: Recently introduced Large Language Models (LLMs) such as ChatGPT have already shown promising results in natural language processing in healthcare. The aim of this study is to systematically review the literature on the applications of LLMs in breast cancer diagnosis and care. Methods: A literature search was conducted using MEDLINE, focus...
Preprint
Objective: Large Language Models (LLMs) have demonstrated proficiency in free-text analysis in healthcare. With recent advancements, GPT-4 now has the capability to analyze both text and accompanying images. The aim of this study was to evaluate the performance of the multimodal GPT-4 in analyzing medical images using USMLE questions that incorpora...
Article
Full-text available
Background and Objectives: Since its invention in the 1970s, the cochlear implant (CI) has been substantially developed. We aimed to assess the trends in the published literature to characterize CI. Materials and Methods: We queried PubMed for all CI-related entries published during 1970–2022. The following data were extracted: year of publication,...