About
Publications: 561
Reads: 103,111
Citations: 26,331
Publications (561)
Artificial intelligence (AI) shows potential to improve health care by leveraging data to build models that can inform clinical workflows. However, access to large quantities of diverse data is needed to develop robust generalizable models. Data sharing across institutions is not always feasible due to legal, security, and privacy concerns. Federat...
Background
Deep learning facilitates large-scale automated imaging evaluation of body composition. However, associations of body composition biomarkers with medical phenotypes have been underexplored. Phenome-wide association study (PheWAS) techniques search for medical phenotypes associated with biomarkers. A PheWAS integrating large-scale analysi...
A major barrier to deploying healthcare AI is trustworthiness. One form of trustworthiness is a model’s robustness across subgroups: while models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection...
Current risk scores using clinical risk factors for predicting ischemic heart disease (IHD) events—the leading cause of global mortality—have known limitations and may be improved by imaging biomarkers. While body composition (BC) imaging biomarkers derived from abdominopelvic computed tomography (CT) correlate with IHD risk, they are impractical t...
Background: Machine learning models that predict survival time for patients with cancer can be useful in the clinic. Validating their performance in deployment is important but challenging, because the only timely source of follow-up/death data is the electronic medical record (EMR), which is known to under-capture deaths resulting in informati...
Background
Breast density is strongly associated with breast cancer risk. Fully automated quantitative density assessment methods have recently been developed that could facilitate large-scale studies, although data on associations with long-term breast cancer risk are limited. We examined LIBRA assessments and breast cancer risk and compared resul...
Purpose:
Our study investigates whether graph-based fusion of imaging data with non-imaging electronic health records (EHR) data can improve the prediction of the disease trajectories for patients with coronavirus disease 2019 (COVID-19) beyond the prediction performance of only imaging or non-imaging EHR data.
Approach:
We present a fusion fram...
A major barrier to deploying healthcare AI models is their trustworthiness. One form of trustworthiness is a model's robustness across different subgroups: while existing models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trus...
Background: The evaluation of treatment (Tx) Rp in NETs using CT/MRI scans can be difficult. Previous studies, including the E2211, have shown improved progression-free survival (PFS) but no significant difference in Rp as measured by RECIST 1.1 (R1.1) Cr. This makes it difficult to determine Tx effectiveness with short-term imaging, further...
Preclinical imaging is a critical component in translational research with significant complexities in workflow and site differences in deployment. Importantly, the National Cancer Institute’s (NCI) precision medicine initiative emphasizes the use of translational co-clinical oncology models to address the biological and molecular bases of cancer p...
We propose a relational graph to incorporate clinical similarity between patients while building personalized clinical event predictors with a focus on hospitalized COVID-19 patients. Our graph formation process fuses heterogeneous data, i.e., chest X-rays as node features and non-imaging EHR for edge formation. While a node represents a snapshot in...
Co-clinical trials are the concurrent or sequential evaluation of therapeutics in both patients clinically and patient-derived xenografts (PDX) pre-clinically, in a manner designed to match the pharmacokinetics and pharmacodynamics of the agent(s) used. The primary goal is to determine the degree to which PDX cohort responses recapitulate patient c...
We propose the first Self-Attention Capsule Network that was designed to deal with unique core challenges of medical imaging, specifically for tissue classification. These challenges are significant data heterogeneity with statistical variability across imaging domains, insufficient spatial context and local fine-grained details, and limited train...
Image augmentations are quintessential for effective visual representation learning across self-supervised learning techniques. While augmentation strategies for natural imaging have been studied extensively, medical images are vastly different from their natural counterparts. Thus, it is unknown whether common augmentation strategies employed in S...
Introduction:
Management of patients with brain metastases is often based on manual lesion detection and segmentation by an expert reader. This is a time- and labor-intensive process, and to that end, this work proposes an end-to-end deep learning segmentation network for a varying number of available MRI sequences.
Methods:
We adapt a...
Reduction in 30-day readmission rate is an important quality factor for hospitals as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, several limitations exist in prior models for hospital readmission prediction, such as: (a) only patients w...
The collection and curation of large-scale medical datasets from multiple institutions is essential for training accurate deep learning models, but privacy concerns often hinder data sharing. Federated learning (FL) is a promising solution that enables privacy-preserving collaborative learning among different institutions, but it generally suffers...
Continuous monitoring of trained ML models to determine when their predictions should and should not be trusted is essential for their safe deployment. Such a framework ought to be high-performing, explainable, post-hoc and actionable. We propose TRUST-LAPSE, a “mistrust” scoring framework for continuous model monitoring. We assess the trustworthin...
Multivariate signals are prevalent in various domains, such as healthcare, transportation systems, and space sciences. Modeling spatiotemporal dependencies in multivariate signals is challenging due to (1) long-range temporal dependencies and (2) complex spatial correlations between sensors. To address these challenges, we propose representing mult...
Purpose
To assess the performance of a machine learning model trained with contrast-enhanced CT-based radiomics features in distinguishing benign from malignant solid renal masses and to compare model performance with three abdominal radiologists.
Methods
Patients who underwent intra-operative ultrasound during a partial nephrectomy were identifie...
Attention (or attribution) map methods are designed to highlight regions of the model's input that were discriminative for its predictions. However, different attention map methods can highlight different regions of the input, with sometimes contradictory explanations for a prediction. This effect is exacerbated when the training set is s...
Self-supervised contrastive learning (CL) based pretraining allows development of robust and generalized deep learning models with small, labeled datasets, reducing the burden of label generation. This paper aims to evaluate the effect of CL-based pretraining on the performance of referable vs non-referable diabetic retinopathy (DR) classification. We...
Continuous monitoring of trained ML models to determine when their predictions should and should not be trusted is essential for their safe deployment. Such a framework ought to be high-performing, explainable, post-hoc and actionable. We propose TRUST-LAPSE, a "mistrust" scoring framework for continuous model monitoring. We assess the trustworthin...
Federated learning is an emerging research paradigm for collaboratively training deep learning models without sharing patient data. However, data from different institutions are usually heterogeneous, which may reduce the performance of models trained using federated learning. In this study, we propose a novel heter...
Domain generalization in medical image classification is an important problem for trustworthy machine learning to be deployed in healthcare. We find that existing approaches for domain generalization which utilize ground-truth abnormality segmentations to control feature attributions have poor out-of-distribution (OOD) performance relative to the s...
PURPOSE
For real-world evidence, it is convenient to use routinely collected data from the electronic medical record (EMR) to measure survival outcomes. However, patients can become lost to follow-up, causing incomplete data and biased survival time estimates. We quantified this issue for patients with metastatic cancer seen in an academic health s...
Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as the lack of convergence and the potential for catastrophic forgetting across real-world hete...
Purpose:
To develop a deep learning-based risk stratification system for thyroid nodules using US cine images.
Materials and methods:
In this retrospective study, 192 biopsy-confirmed thyroid nodules (175 benign, 17 malignant) in 167 unique patients (mean age, 56 years ± 16 [SD], 137 women) undergoing cine US between April 2017 and May 2018 with...
Measures to predict 30-day readmission are considered an important quality factor for hospitals, as accurate predictions can reduce the overall cost of care by identifying high-risk patients before they are discharged. While recent deep learning-based studies have shown promising empirical results on readmission prediction, several limitations exist...
[This corrects the article DOI: 10.1038/s42256-021-00421-z.]
Objective or Purpose
To utilize a deep learning (DL) model trained via federated learning (FL), a method of collaborative training without sharing patient data, to delineate institutional differences in clinician diagnostic paradigms and disease epidemiology in retinopathy of prematurity (ROP).
Design
Evaluation of a diagnostic test or technology...
Objective
To compare the performance of deep learning (DL) classifiers for the diagnosis of plus disease in retinopathy of prematurity (ROP) trained using two methods of developing models on multi-institutional datasets: centralizing data versus federated learning (FL) where no data leaves each institution.
Design
Evaluation of a diagnostic test o...
Collaborative learning, which enables collaborative and decentralized training of deep neural networks at multiple institutions in a privacy-preserving manner, is rapidly emerging as a valuable technique in healthcare applications. However, its distributed nature often leads to significant heterogeneity in data distributions across institutions. In...
Background
Transfer learning is a common practice in image classification with deep learning where the available data is often limited for training a complex model with millions of parameters. However, transferring language models requires special attention since cross-domain vocabularies (e.g. between two different modalities MR and US) do not alw...
Radiology reports are a rich resource for advancing deep learning applications for medical images, facilitating the generation of large-scale annotated image databases. However, the ambiguity and subtlety of natural language pose a significant challenge to information extraction from radiology reports. Thyroid Imaging Reporting and Data Systems (T...
Scoliosis is a condition of abnormal lateral spinal curvature affecting an estimated 2 to 3% of the US population, or seven million people. The Cobb angle is the standard measurement of spinal curvature in scoliosis but is known to have high interobserver and intraobserver variability. Thus, the objective of this study was to build and validate a s...
Purpose:
To automatically identify a cohort of patients with pancreatic cystic lesions (PCLs) and extract PCL measurements from historical CT and MRI reports using natural language processing (NLP) and a question answering system.
Materials and methods:
Institutional review board approval was obtained for this retrospective Health Insurance Port...
Purpose: This study investigates whether graph-based fusion of imaging data with non-imaging EHR data can improve the prediction of disease trajectory for COVID-19 patients, beyond the prediction performance of only imaging or non-imaging EHR data.
Materials and Methods: We present a novel graph-based framework for fine-grained clinical outcome pre...
The purpose of this study was to assess the clinical value of a deep learning (DL) model for automatic detection and segmentation of brain metastases, in which a neural network is trained on four distinct MRI sequences using an input-level dropout layer, thus simulating the scenario of missing MRI sequences by training on the full set and...
With clinical trials unable to detect all potential adverse reactions to drugs and medical devices prior to their release into the market, accurate post-market surveillance is critical to ensure their safety and efficacy. Electronic health records (EHR) contain rich observational patient data, making them a valuable source to actively monitor the s...
In a clinical setting, epilepsy patients are monitored via video electroencephalogram (EEG) tests. A video EEG records what the patient experiences on videotape while an EEG device records their brainwaves. Currently, there are no existing automated methods for tracking the patient's location during a seizure, and video recordings of hospital patie...
Artificial intelligence (AI) provides a promising substitution for streamlining COVID-19 diagnoses. However, concerns surrounding security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalised model in clinical practices. To address this, we launch the U...
Background
The Mortality Probability Model (MPM) is used in research and quality improvement to adjust for severity of illness and can also inform triage decisions. However, a limitation for its automated use or application is that it includes the variable “intracranial mass effect” (IME), which requires human engagement with the electronic health...
Deep learning models have demonstrated favorable performance on many medical image classification tasks. However, they rely on expensive hand-labeled datasets that are time-consuming to create. In this work, we explore a new supervision source for training deep learning models by using gaze data that is passively and cheaply collected during a clini...
Age-related macular degeneration (AMD) is a leading cause of severe vision loss. With our aging population, it may affect 288 million people globally by the year 2040. AMD progresses from an early and intermediate dry form to an advanced one, which manifests as choroidal neovascularization and geographic atrophy. Conversion to AMD-related exudation...
Purpose
Magnetic resonance (MR) imaging is an essential diagnostic tool in clinical medicine. Recently, a variety of deep‐learning methods have been applied to segmentation tasks in medical images, with promising results for computer‐aided diagnosis. For MR images, effectively integrating different pulse sequences is important to optimize performan...
Suboptimal generalization of machine learning models on unseen data is a key challenge which hampers the clinical applicability of such models to medical imaging. Although various methods such as domain adaptation and domain generalization have evolved to combat this challenge, learning robust and generalizable representations is core to medical im...
Federated learning enables multiple institutions to collaboratively train machine learning models on their local data in a privacy-preserving way. However, its distributed nature often leads to significant heterogeneity in data distributions across institutions. In this paper, we investigate the deleterious impact of a taxonomy of data heterogeneit...
Triple-negative breast cancer, the poorest-prognosis breast cancer subtype, lacks clinically approved biomarkers for patient risk stratification and treatment management. Prior literature has shown that interrogation of the tumor-immune microenvironment may be a promising approach to fill these gaps. Recently developed high-dimensional tissue imagi...
Purpose:
To probabilistically forecast needed anti-vascular endothelial growth factor (anti-VEGF) treatment frequency using volumetric spectral domain-optical coherence tomography (SD-OCT) biomarkers in neovascular age-related macular degeneration from real-world settings.
Methods:
SD-OCT volume scans were segmented with a custom deep-learning-b...
Optimization plays a key role in the training of deep neural networks. Deciding when to stop training can have a substantial impact on the performance of the network during inference. Under certain conditions, the generalization error can display a double descent pattern during training: the learning curve is non-monotonic and seemingly diverges be...
Images are pervasive in biomedicine, providing key information used for understanding the phenotype of disease. Biomedical imaging informatics is a field that involves computational methods related to acquisition, processing, and analysis of images in biomedicine. Major topics in biomedical imaging informatics follow the life cycle of images in the...
Background: The COVID-19 pandemic affected oncology practice in ways that are still evolving. In particular, COVID-19 has led to changes in cancer treatment for patients (pts) infected with COVID, which may have long-term implications for both COVID and cancer-related outcomes. In this retrospective analysis, we describe changes in cancer mana...
Chest X-rays of coronavirus disease 2019 (COVID-19) patients are frequently obtained to determine the extent of lung disease and are a valuable source of data for creating artificial intelligence models. Most work to date assessing disease severity on chest imaging has focused on segmenting computed tomography (CT) images; however, given that CTs a...
Efficient prediction of cancer recurrence in advance may help to recruit high-risk breast cancer patients for clinical trials on time and can guide a proper treatment plan. Several machine learning approaches have been developed for recurrence prediction in previous studies, but most of them use only structured electronic health records and only a s...
Purpose:
To develop a convolutional neural network (CNN) to triage head CT (HCT) studies and investigate the effect of upstream medical image processing on the CNN's performance.
Materials and methods:
A total of 9776 HCT studies were retrospectively collected from 2001 through 2014, and a CNN was trained to triage them as normal or abnormal. CN...
The reliability of machine learning models can be compromised when trained on low-quality data. Many large-scale medical imaging datasets contain low-quality labels extracted from sources such as medical reports. Moreover, images within a dataset may have heterogeneous quality due to artifacts and biases arising from equipment or measurement errors...
Automated seizure detection and classification from electroencephalography (EEG) can greatly improve the diagnosis and treatment of seizures. While prior studies mainly used convolutional neural networks (CNNs) that assume image-like structure in EEG signals or spectrograms, this modeling choice does not reflect the natural geometry of or connectiv...
PURPOSE
Knowing the treatments administered to patients with cancer is important for treatment planning and correlating treatment patterns with outcomes for personalized medicine study. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach with structured electronic medical records and...
Purpose:
Large-scale analysis of real-world evidence is often limited to structured data fields that do not contain reliable information on recurrence status and disease sites. In this report, we describe a natural language processing (NLP) framework that uses data from free-text, unstructured reports to classify recurrence status and sites of rec...
Model brittleness is a key concern when deploying deep learning models in real-world medical settings. A model that has high performance at one institution may suffer a significant decline in performance when tested at other institutions. While pooling datasets from multiple institutions and retraining may provide a straightforward solution, it is...
Sensitive and robust outcome measures of retinal function are pivotal for clinical trials in age-related macular degeneration (AMD). A recent development is the implementation of artificial intelligence (AI) to infer results of psychophysical examinations based on findings derived from multimodal imaging. We conducted a review of the current litera...
In medicine, randomized clinical trials (RCT) are the gold standard for informing treatment decisions. Observational comparative effectiveness research (CER) is often plagued by selection bias, and expert-selected covariates may not be sufficient to adjust for confounding. We explore how the unstructured clinical text in electronic medical records...
The automated prediction of geographic atrophy (GA) lesion growth can help ophthalmologists understand how GA progresses and assess the efficacy of current treatment and the prognosis of the disease. We developed an integrated time-adaptive prediction model for identifying the location of future GA growth. The proposed model consisted of...
Recurrence risk stratification of patients undergoing primary surgical resection for hepatocellular carcinoma (HCC) is an area of active investigation, and several staging systems have been proposed to optimize treatment strategies. However, as many as 70% of patients still experience tumor recurrence at 5 years post-surgery. We developed and valid...
Triple-negative breast cancer (TNBC), the poorest-prognosis breast cancer subtype, lacks clinically approved biomarkers for patient risk stratification, treatment management, and immunotherapies. Prior literature has shown that interrogation of the tumor-immune microenvironment (TIME) may be a promising approach for the discovery of novel biomarker...