Curtis P Langlotz

Stanford University | SU · Department of Radiology

MD, PhD

About

264 Publications
43,369 Reads
10,845 Citations
Citations since 2017
84 Research Items
7071 Citations
[Chart: citations per year, 2017 to 2023; y-axis ticks at 0, 500, 1,000, and 1,500]
Additional affiliations
July 2004 - June 2014
University of Pennsylvania
Position
  • Professor and Vice Chair for Informatics

Publications (264)
Preprint
Full-text available
Importance A recently developed vision foundation model, "Segment Anything (SAM)," promises to segment any objects in images. However, the performance of SAM on clinical echocardiography images is yet to be investigated and compared against the domain-specific models. Objective To evaluate the performance of SAM on transthoracic echocardiography (T...
Preprint
Full-text available
Vision-language models (VLMs), such as CLIP and ALIGN, are generally trained on datasets consisting of image-caption pairs obtained from the web. However, real-world multimodal datasets, such as healthcare data, are significantly more complex: each image (e.g. X-ray) is often paired with text (e.g. physician report) that describes many distinct att...
Article
Full-text available
Objective To describe the infrastructure, tools, and services developed at Stanford Medicine to maintain its data science ecosystem and research patient data repository for clinical and translational research. Materials and Methods The data science ecosystem, dubbed the Stanford Data Science Resources (SDSR), includes infrastructure and tools to c...
Preprint
Interstitial lung diseases (ILD) present diagnostic challenges due to their varied manifestations and overlapping imaging features. To address this, we propose a machine learning approach that utilizes CLIP, a multimodal (image and text) self-supervised model, for ILD classification. We extensively integrate zero-shot CLIP throughout our workflow,...
Preprint
Full-text available
We systematically investigate lightweight strategies to adapt large language models (LLMs) for the task of radiology report summarization (RRS). Specifically, we focus on domain adaptation via pretraining (on natural language, biomedical text, and clinical text) and via prompting (zero-shot, in-context learning) or parameter-efficient fine-tuning (...
Preprint
Full-text available
Large-language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate...
Article
Objective To develop an automated deidentification pipeline for radiology reports that detects protected health information (PHI) entities and replaces them with realistic surrogates “hiding in plain sight.” Materials and Methods In this retrospective study, 999 chest X-ray and CT reports collected between November 2019 and November 2020 were annot...
Preprint
Full-text available
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities in generating high-quality images. Medical imaging data is fundamentally different to natural images, and the language used to succinctly capture relevant details in medical data uses a different, narrow but semantically rich, domain-specific voc...
Preprint
Full-text available
Radiology report summarization is a growing area of research. Given the Findings and/or Background sections of a radiology report, the goal is to generate a summary (called an Impression section) that highlights the key observations and conclusions of the radiology study. Recent efforts have released systems that achieve promising performance as me...
Article
Building a document-level classifier for COVID-19 on radiology reports could help assist providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, its continuous pre-training on radiology reports that...
Article
Full-text available
Advances in artificial intelligence (AI) and computer vision hold great promise for assisting medical staff, optimizing healthcare workflow, and improving patient outcomes. The COVID-19 pandemic, which caused unprecedented stress on healthcare systems around the world, presented what seems to be a perfect opportunity for AI to demonstrate its usefu...
Preprint
Full-text available
Neural image-to-text radiology report generation systems offer the potential to improve radiology reporting by reducing the repetitive process of report drafting and identifying possible medical errors. These systems have achieved promising performance as measured by widely used NLG metrics such as BLEU and CIDEr. However, the current systems face...
Preprint
Full-text available
Multi-modal foundation models are typically trained on millions of pairs of natural images and text captions, frequently obtained through web-crawling approaches. Although such models depict excellent generative capabilities, they do not typically generalize well to specific domains such as medical images that have fundamentally shifted distributio...
Article
Full-text available
In tasks involving the interpretation of medical images, suitably trained machine-learning models often exceed the performance of medical experts. Yet such a high level of performance typically requires that the models be trained with relevant datasets that have been painstakingly annotated by experts. Here we show that a self-supervised model trai...
Preprint
Full-text available
The application of AI to medical image interpretation tasks has largely been limited to the identification of a handful of individual pathologies. In contrast, the generation of complete narrative radiology reports more closely matches how radiologists communicate diagnostic information in clinical workflows. Recent progress in artificial intellige...
Article
As the role of artificial intelligence (AI) in clinical practice evolves, governance structures oversee the implementation, maintenance, and monitoring of clinical AI algorithms to enhance quality, manage resources, and ensure patient safety. In this article, a framework is established for the infrastructure required for clinical AI implementation...
Article
The fact that medical images are still predominately exchanged between institutions via physical media is unacceptable in the era of value-driven health care. Although better solutions are technically possible, problems of coordination and market dynamics may be inhibiting progress more than technical factors. We provide a macrosystem analysis of t...
Article
The National Institutes of Health in 2018 identified key focus areas for the future of artificial intelligence in medical imaging, creating a foundational roadmap for research in image acquisition, algorithms, data standardization and translatable clinical decision support systems. Among the key issues raised in the report, data availability, the n...
Article
Background Previous studies suggest that use of artificial intelligence (AI) algorithms as diagnostic aids may improve the quality of skeletal age assessment, though these studies lack evidence from clinical practice. Purpose To compare the accuracy and interpretation time of skeletal age assessment on hand radiograph examinations with and without...
Article
Full-text available
Despite progressive improvements over the decades, the rich temporally resolved data in an echocardiogram remain underutilized. Human assessments reduce the complex patterns of cardiac wall motion to a small list of measurements of heart function. All modern echocardiography artificial intelligence (AI) systems are similarly limited by design – au...
Preprint
Full-text available
Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed...
Article
Full-text available
Objective The study sought to develop and evaluate neural natural language processing (NLP) packages for the syntactic analysis and named entity recognition of biomedical and clinical English text. Materials and Methods We implement and train biomedical and clinical English NLP pipelines by extending the widely used Stanza library originally desig...
Article
Full-text available
Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven...
Article
Background Multicenter data on long term survival following LVAD implantation that make use of contemporary definitions of RV failure are limited. Furthermore, traditional survival analyses censor patients who receive a bridge to heart transplant. Here we compare the outcomes of LVAD patients who develop post-operative RV failure accounting for the...
Preprint
Integrating methods for time-to-event prediction with diagnostic imaging modalities is of considerable interest, as accurate estimates of survival require accounting for censoring of individuals within the observation period. New methods for time-to-event prediction have been developed by extending the Cox proportional hazards model with neural ne...
Preprint
Advances in computing power, deep learning architectures, and expert labelled datasets have spurred the development of medical imaging artificial intelligence systems that rival clinical experts in a variety of scenarios. The National Institutes of Health in 2018 identified key focus areas for the future of artificial intelligence in medical imagin...
Article
Full-text available
Artificial intelligence (AI) algorithms continue to rival human performance on a variety of clinical tasks, while their actual impact on human diagnosticians, when incorporated into clinical workflows, remains relatively unexplored. In this study, we developed a deep learning-based assistant to help pathologists differentiate between two subtypes o...
Article
Introduction: Post-operative right ventricular failure (RV failure) is the single largest contributor to short-term mortality in patients with left ventricular assist devices (LVAD); yet predicting which patient is at risk of developing this complication in the pre-operative setting has remained beyond the abilities of experts in the field. We hypo...
Preprint
Neural image-to-text radiology report generation systems offer the potential to accelerate clinical processes by saving radiologists from the repetitive labor of drafting radiology reports and preventing medical errors. However, existing report generation systems, despite achieving high performances on natural language generation metrics such as CI...
Preprint
Learning visual representations of medical images is core to medical image understanding but its progress has been held back by the small size of hand-labeled datasets. Existing work commonly relies on transferring weights from ImageNet pretraining, which is suboptimal due to drastically different image characteristics, or rule-based label extracti...
Article
Although artificial intelligence (AI)-based algorithms for diagnosis hold promise for improving care, their safety and effectiveness must be ensured to facilitate wide adoption. Several recently proposed regulatory frameworks provide a solid foundation but do not address a number of issues that may prevent algorithms from being fully trusted. In th...
Preprint
Purpose: To develop and evaluate the accuracy of a multi-view deep learning approach to the analysis of high-resolution synthetic mammograms from digital breast tomosynthesis screening cases, and to assess the effect on accuracy of image resolution and training set size. Materials and Methods: In a retrospective study, 21,264 screening digital brea...
Article
Artificial intelligence algorithms based on principles of deep learning (DL) have made a large impact on the acquisition, reconstruction, and interpretation of MRI data. Despite the large number of retrospective studies using DL, there are fewer applications of DL in the clinic on a routine basis. To address this large translational gap, we review...
Preprint
We introduce biomedical and clinical English model packages for the Stanza Python NLP library. These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text, by combining Stanza's fully neural architecture with a wide variety of open datasets as well as large-scale unsupervised biomedica...
Article
Full-text available
Accurate assessment of cardiac function is crucial for the diagnosis of cardiovascular disease¹, screening for cardiotoxicity² and decisions regarding the clinical management of patients with a critical illness³. However, human assessment of cardiac function focuses on a limited sampling of cardiac cycles and has considerable inter-observer variabi...
Article
In this article, the authors propose an ethical framework for using and sharing clinical data for the development of artificial intelligence (AI) applications. The philosophical premise is as follows: when clinical data are used to provide care, the primary purpose for acquiring the data is fulfilled. At that point, clinical data should be treated...
Article
Full-text available
The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abd...
Preprint
While artificial intelligence (AI) algorithms continue to rival human performance on a variety of clinical tasks, the question of how best to incorporate these algorithms into clinical workflows remains relatively unexplored. We investigated how AI can affect pathologist performance on the task of differentiating between two subtypes of primary liv...
Preprint
Full-text available
Abstractive summarization models are able to generate summaries which have high overlap with human references. However, existing models are not optimized for factual correctness, a critical metric in real-world applications. In this work, we propose to evaluate the factual correctness of a generated summary by fact-checking it against its reference using a...
Preprint
Different convolutional neural network (CNN) models have been tested for their application in histologic imaging analyses. However, these models are prone to overfitting due to their large parameter capacity, requiring more data and expensive computational resources for model training. Given these limitations, we developed and tested PlexusNet for...
Article
Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertaintie...
Article
Advances in machine learning in medical imaging are occurring at a rapid pace in research laboratories both at academic institutions and in industry. Important artificial intelligence (AI) tools for diagnostic imaging include algorithms for disease detection and classification, image optimization, radiation reduction, and workflow enhancement. Alth...
Article
Full-text available
Imaging research laboratories are rapidly creating machine learning systems that achieve expert human performance using open-source methods and tools. These artificial intelligence systems are being developed to improve medical image reconstruction, noise reduction, quality assurance, triage, segmentation, computer-aided detection, computer-aided c...
Article
The 2018 RSNA Summit on AI in Radiology brought together a diverse group of stakeholders to identify and prioritize areas of need related to artificial intelligence in radiology. This article presents the proceedings of the summit with emphasis on RSNA's role in leading, organizing, and catalyzing change during this important time in radiology. © R...
Preprint
Full-text available
Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertaintie...
Article
Full-text available
Background Magnetic resonance imaging (MRI) of the knee is the preferred method for diagnosing knee injuries. However, interpretation of knee MRI is time-intensive and subject to diagnostic error and variability. An automated system for interpreting knee MRI could prioritize high-risk patients and assist clinicians in making diagnoses. Deep learnin...
Data
MRNet implementation for external validation. (ZIP)
Data
Magnetic resonance imaging settings and parameters for the Stanford musculoskeletal knee protocol. (DOCX)
Data
Comparison of individual unassisted and model-assisted clinical experts on the validation set. (DOCX)
Data
Comparison of unassisted and model-assisted performance metrics of clinical experts on the validation set. (DOCX)
Data
Sensitivity analysis: Comparison of unassisted and model-assisted performance metrics of general radiologists on the validation set. (DOCX)
Article
Objective: The purpose of this study is to determine whether the type of feedback on evidence-based guideline adherence influences adult primary care provider (PCP) lumbar spine (LS) MRI orders for low back pain (LBP). Materials and methods: Four types of guideline adherence feedback were tested on eight tertiary health care system outpatient PC...
Article
Full-text available
Background Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in...
Data
Performance measures of the algorithm and radiologists on the validation set for all other pathologies. Each plot shows the diagnostic measures of the algorithm (purple diamond), micro-average resident radiologist (unfilled orange diamond), micro-average BC radiologist (filled orange diamond), individual resident radiologists (unfilled green diamon...
Data
ChestX-ray14 training set label prevalence compared with algorithm performance. (DOCX)
Data
Inter-rater agreement of the 3 cardiothoracic specialist radiologists on the validation set. (DOCX)
Data
Performance measure values of the algorithm and radiologists on the reference standard set for all pathologies. The “Resident radiologists” expert refers to the micro-average over the 3 resident radiologists and “BC radiologists” expert refers to the micro-average over the 6 board-certified radiologists. Individual estimates follow. (XLSX)
Data
Summary statistics of training, tuning, and validation datasets. (DOCX)
Data
Mean proportion correct over all pathologies on the validation set. (DOCX)
Data
Interpreting network predictions. The left image in each panel is the original radiograph with radiologist annotations (pink ovals) highlighting the abnormality in the radiograph; these indicators were not present when the images were input to the algorithm. The right image in each panel is the localization heatmap output by the algorithm overlayin...
Data
ChestX-ray14 label statistics and ChestX-ray14 label agreement with the validation set. (DOCX)
Article
Purpose To assess the ability of convolutional neural networks (CNNs) to enable high-performance automated binary classification of chest radiographs. Materials and Methods In a retrospective study, 216 431 frontal chest radiographs obtained between 1998 and 2012 were procured, along with associated text reports and a prospective label from the att...
Article
This paper explores cutting-edge deep learning methods for information extraction from medical imaging free text reports at a multi-institutional scale and compares them to the state-of-the-art domain-specific rule-based system – PEFinder and traditional machine learning methods – SVM and Adaboost. We proposed two distinct deep learning models – (i...
Preprint
Full-text available
The Impression section of a radiology report summarizes crucial radiology findings in natural language and plays a central role in communicating these findings to physicians. However, the process of generating impressions by summarizing findings is time-consuming for radiologists and prone to errors. We propose to automate the generation of radiolo...
Article
Rationale and objectives: To evaluate a natural language processing (NLP) system built with open-source tools for identification of lumbar spine imaging findings related to low back pain on magnetic resonance and x-ray radiology reports from four health systems. Materials and methods: We used a limited data set (de-identified except for dates) s...
Article
Purpose To evaluate the performance of a deep learning convolutional neural network (CNN) model compared with a traditional natural language processing (NLP) model in extracting pulmonary embolism (PE) findings from thoracic computed tomography (CT) reports from two institutions. Materials and Methods Contrast material-enhanced CT examinations of t...
Article
Purpose To compare the performance of a deep-learning bone age assessment model based on hand radiographs with that of expert radiologists and that of existing automated models. Materials and Methods The institutional review board approved the study. A total of 14 036 clinical hand radiographs and corresponding reports were obtained from two childr...