Mathias Unberath
Johns Hopkins University | JHU · Laboratory for Computational Sensing and Robotics

PhD

311
Publications
47,439
Reads
4,714
Citations
Introduction
I lead the Advanced Robotics and Computationally AugmenteD Environments (ARCADE) lab. Our research is at the intersection of computer vision, machine learning, augmented reality, robotics, and medical imaging to develop collaborative systems that advance decision making across the healthcare spectrum.
Additional affiliations
October 2018 - present
Johns Hopkins University
Position
  • Professor
Description
  • Affiliated with the Malone Center for Engineering in Healthcare and the Laboratory for Computational Sensing and Robotics.
June 2017 - October 2018
Johns Hopkins University
Position
  • PostDoc Position
November 2014 - May 2017
Friedrich-Alexander-University Erlangen-Nürnberg
Position
  • PhD Student
Education
November 2014 - May 2017
Friedrich-Alexander-University Erlangen-Nürnberg
Field of study
  • Computer Science, Pattern Recognition Lab
March 2014 - October 2014
Stanford University
Field of study
  • Radiology
October 2012 - October 2014
Friedrich-Alexander-University Erlangen-Nürnberg
Field of study
  • Advanced Optical Technologies

Publications (311)
Article
Full-text available
Leveraging Artificial Intelligence (AI) in decision support systems has disproportionately focused on technological advancements, often overlooking the alignment between algorithmic outputs and human expectations. A human-centered perspective attempts to alleviate this concern by designing AI solutions for seamless integration with existing process...
Preprint
Vertebral compression fractures (VCFs) are a common and potentially serious consequence of osteoporosis. Yet, they often remain undiagnosed. Opportunistic screening, which involves automated analysis of medical imaging data acquired primarily for other purposes, is a cost-effective method to identify undiagnosed VCFs. In high-stakes scenarios like...
Preprint
Full-text available
Segment Anything Models (SAMs) have gained increasing attention in medical image analysis due to their zero-shot generalization capability in segmenting objects of unseen classes and domains when provided with appropriate user prompts. Addressing this performance gap is important to fully leverage the pre-trained weights of SAMs, particularly in th...
Preprint
Full-text available
Natural language offers a convenient, flexible interface for controlling robotic C-arm X-ray systems, making advanced functionality and controls accessible. However, enabling language interfaces requires specialized AI models that interpret X-ray images to create a semantic representation for reasoning. The fixed outputs of such AI models limit the...
Article
Full-text available
In percutaneous pelvic trauma surgery, accurate placement of Kirschner wires (K‐wires) is crucial to ensure effective fracture fixation and avoid complications due to breaching the cortical bone along an unsuitable trajectory. Surgical navigation via mixed reality (MR) can help achieve precise wire placement in a low‐profile form factor. Current ap...
Preprint
Surgical phase recognition is essential for analyzing procedure-specific surgical videos. While recent transformer-based architectures have advanced sequence processing capabilities, they struggle with maintaining consistency across lengthy surgical procedures. Drawing inspiration from classical hidden Markov models' finite-state interpretations, w...
Preprint
Full-text available
Robustness audits of deep neural networks (DNN) provide a means to uncover model sensitivities to the challenging real-world imaging conditions that significantly degrade DNN performance in-the-wild. Such conditions are often the result of the compounding of multiple factors inherent to the environment, sensor, or processing pipeline and may lead t...
Preprint
Purpose: Surgical phase recognition (SPR) is an integral component of surgical data science, enabling high-level surgical analysis. End-to-end trained neural networks that predict surgical phase directly from videos have shown excellent performance on benchmarks. However, these models struggle with robustness due to non-causal associations in the t...
Article
Full-text available
Introduction: Effective delivery of healthcare depends on timely and accurate triage decisions, directing patients to appropriate care pathways and reducing unnecessary visits. Artificial Intelligence (AI) solutions, particularly those based on Large Language Models (LLMs), may enable non-experts to make better triage decisions at home, thus easing...
Preprint
Objective: Multiple studies have attempted to generate visual field (VF) mean deviation (MD) estimates using cross-sectional optical coherence tomography (OCT) data. However, whether such models offer any value in detecting longitudinal VF progression is unclear. We address this by developing a machine learning (ML) model to convert OCT data to MD a...
Preprint
Full-text available
Robotic planning and execution in open-world environments is a complex problem due to the vast state spaces and high variability of task embodiment. Recent advances in perception algorithms, combined with Large Language Models (LLMs) for planning, offer promising solutions to these challenges, as the common sense reasoning capabilities of LLMs prov...
Preprint
Full-text available
Arthroscopy is a minimally invasive surgical procedure used to diagnose and treat joint problems. The clinical workflow of arthroscopy typically involves inserting an arthroscope into the joint through a small incision, during which surgeons navigate and operate largely by relying on their visual assessment through the arthroscope. However, the art...
Preprint
Full-text available
In percutaneous pelvic trauma surgery, accurate placement of Kirschner wires (K-wires) is crucial to ensure effective fracture fixation and avoid complications due to breaching the cortical bone along an unsuitable trajectory. Surgical navigation via mixed reality (MR) can help achieve precise wire placement in a low-profile form factor. Current ap...
Article
Full-text available
Segment anything models (SAMs) are gaining attention for their zero-shot generalization capability in segmenting objects of unseen classes and in unseen domains when properly prompted. Interactivity is a key strength of SAMs, allowing users to iteratively provide prompts that specify objects of interest to refine outputs. However, to realize the in...
Preprint
Full-text available
Large language model-based (LLM) agents are emerging as a powerful enabler of robust embodied intelligence due to their capability of planning complex action sequences. Sound planning ability is necessary for robust automation in many task domains, but especially in surgical automation. These agents rely on a highly detailed natural language repres...
Preprint
Full-text available
Fully supervised deep learning (DL) models for surgical video segmentation have been shown to struggle with non-adversarial, real-world corruptions of image quality including smoke, bleeding, and low illumination. Foundation models for image segmentation, such as the segment anything model (SAM) that focuses on interactive prompt-based segmentation...
Article
Full-text available
Background: Artificial intelligence-based (AI) clinical decision support systems (CDSS) using unconventional data, like smartphone-acquired images, promise transformational opportunities for telehealth, including remote diagnosis. Although such solutions’ potential remains largely untapped, providers’ trust and understanding are vital for effective...
Article
Full-text available
Pelvic ring disruptions result from blunt injury mechanisms and are potentially lethal mainly due to associated injuries and massive pelvic hemorrhage. The severity of pelvic fractures in trauma victims is frequently assessed by grading the fracture according to the Tile AO/OTA classification in whole-body Computed Tomography (CT) scans. Due to the...
Preprint
Full-text available
Accurate segmentation of anatomical structures and pathological regions in medical images is crucial for diagnosis, treatment planning, and disease monitoring. While the Segment Anything Model (SAM) and its variants have demonstrated impressive interactive segmentation capabilities on image types not seen during training without the need for domain...
Preprint
Full-text available
Brain tumor analysis in Magnetic Resonance Imaging (MRI) is crucial for accurate diagnosis and treatment planning. However, the task remains challenging due to the complexity and variability of tumor appearances, as well as the scarcity of labeled data. Traditional approaches often address tumor segmentation and image generation separately, limitin...
Article
Cardiovascular disease (CVD) is a major cause of mortality worldwide, especially in resource-limited countries with limited access to healthcare resources. Early detection and accurate imaging are vital for managing CVD, emphasizing the significance of patient education. Generative AI, including algorithms to synthesize text, speech, images, and co...
Preprint
Full-text available
Accurate segmentation of tools in robot-assisted surgery is critical for machine perception, as it facilitates numerous downstream tasks including augmented reality feedback. While current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions, these models have proven susceptible to even minor c...
Article
Full-text available
Surgical data science is devoted to enhancing the quality, safety, and efficacy of interventional healthcare. While the use of powerful machine learning algorithms is becoming the standard approach for surgical data science, the underlying end-to-end task models directly infer high-level concepts (e.g., surgical phase or skill) from low-level obser...
Article
Machine learning (ML) and deep learning (DL) have potential applications in medicine. This overview explores the applications of AI in cardiovascular imaging, focusing on echocardiography, cardiac MRI (CMR), coronary CT angiography (CCTA), and CT morphology and function. AI, particularly DL approaches like convolutional neural networks, enhances st...
Preprint
Full-text available
Primary care providers are vital for initial triage and referrals to specialty care. In glaucoma, asymptomatic and fast progression can lead to vision loss, necessitating timely referrals to specialists. However, primary eye care providers may not identify urgent cases, potentially delaying care. Artificial Intelligence (AI) offering explanations c...
Article
Monocular SLAM algorithms are the key enabling technology for image-based surgical navigation systems for endoscopic procedures. Due to the visual feature scarcity and unique lighting conditions encountered in endoscopy, classical SLAM approaches perform inconsistently. Many of the recent approaches to endoscopic SLAM rely on deep learning models....
Article
Preoperative imaging plays a pivotal role in sinus surgery where CTs offer patient-specific insights of complex anatomy, enabling real-time intraoperative navigation to complement endoscopy imaging. However, surgery elicits anatomical changes not represented in the preoperative model, generating an inaccurate basis for navigation during surgery pro...
Article
Full-text available
Purpose: Interventional Cone‐Beam CT (CBCT) offers 3D visualization of soft‐tissue and vascular anatomy, enabling 3D guidance of abdominal interventions. However, its long acquisition time makes CBCT susceptible to patient motion. Image‐based autofocus offers a suitable platform for compensation of deformable motion in CBCT, but it relies on handcra...
Article
Full-text available
Purpose: Specialized robotic and surgical tools are increasing the complexity of operating rooms (ORs), requiring elaborate preparation especially when techniques or devices are to be used for the first time. Spatial planning can improve efficiency and identify procedural obstacles ahead of time, but real ORs offer little availability to optimize s...
Article
Eye gaze tracking and pupillometry are evolving areas within the field of tele-robotic surgery, particularly in the context of estimating cognitive load (CL). However, this is a recent field, and current solutions for gaze and pupil tracking in robotic surgery require assessment. Considering the necessity of stable pupillometry signals for reliable...
Article
Objective: Obtaining automated, objective 3‐dimensional (3D) models of the Eustachian tube (ET) and the internal carotid artery (ICA) from computed tomography (CT) scans could provide useful navigational and diagnostic information for ET pathologies and interventions. We aim to develop a deep learning (DL) pipeline to automatically segment the ET an...
Article
Teamwork in surgery depends on a shared mental model of success, i.e., a common understanding of objectives in the operating room. A shared model leads to increased engagement among team members and is associated with fewer complications and overall better outcomes for patients. However, clinical training typically focuses on role-specific skills,...
Article
Full-text available
The expanding capabilities of surgical systems bring with them increasing complexity in the interfaces that humans use to control them. Robotic C-arm X-ray imaging systems, for instance, often require manipulation of independent axes via joysticks, while higher-level control options hide inside device-specific menus. The complexity of these interfa...
Article
Gaze tracking and pupillometry are established proxies for cognitive load, giving insights into a user’s mental effort. In tele-robotic surgery, knowing a user’s cognitive load can inspire novel human–machine interaction designs, fostering contextual surgical assistance systems and personalized training programs. While pupillometry-based methods fo...
Article
Robust and accurate eye gaze tracking can advance medical telerobotics by providing complementary data for surgical training, interactive instrument control, and augmented human–robot interactions. However, current gaze tracking solutions for systems such as the da Vinci Surgical System (dVSS) are limited to complex hardware installations. Addition...
Article
Full-text available
This perspective outlines the Artificial Intelligence and Technology Collaboratories (AITC) at Johns Hopkins University, University of Pennsylvania, and University of Massachusetts, highlighting their roles in developing AI‐based technologies for older adult care, particularly targeting Alzheimer's disease (AD). These National Institute on Aging (N...
Article
The AAST Organ Injury Scale is widely adopted for splenic injury severity but suffers from only moderate inter-rater agreement. This work assesses SpleenPro, a prototype interactive explainable artificial intelligence/machine learning (AI/ML) diagnostic aid to support AAST grading, for effects on radiologist dwell time, agreement, clinical utility,...
Article
Full-text available
Linear regression of optical coherence tomography measurements of peripapillary retinal nerve fiber layer thickness is often used to detect glaucoma progression and forecast future disease course. However, current measurement frequencies suggest that clinicians often apply linear regression to a relatively small number of measurements (e.g., less t...
Article
Full-text available
To develop and evaluate the performance of a deep learning model (DLM) that predicts eyes at high risk of surgical intervention for uncontrolled glaucoma based on multimodal data from an initial ophthalmology visit. Longitudinal, observational, retrospective study. 4898 unique eyes from 4038 adult glaucoma or glaucoma-suspect patients who underwent...
Article
Full-text available
Human cognition relies on embodiment as a fundamental mechanism. Virtual avatars allow users to experience the adaptation, control, and perceptual illusion of alternative bodies. Although virtual bodies have medical applications in motor rehabilitation and therapeutic interventions, their potential for learning anatomy and medical communication rem...
Article
Full-text available
Background: Reproducible approaches are needed to bring AI/ML for medical image analysis closer to the bedside. Investigators wishing to shadow test cross-sectional medical imaging segmentation algorithms on new studies in real-time will benefit from simple tools that integrate PACS with on-premises image processing, allowing visualization of DICOM-...
Chapter
Full-text available
The synergy of long-range dependencies from transformers and local representations of image content from convolutional neural networks (CNNs) has led to advanced architectures and increased performance for various medical image analysis tasks due to their complementary benefits. However, compared with CNNs, transformers require considerably more tr...
Chapter
Full-text available
Nuclei appear small in size, yet, in real clinical practice, the global spatial information and correlation of the color or brightness contrast between nuclei and background, have been considered a crucial component for accurate nuclei segmentation. However, the field of automatic nuclei segmentation is dominated by Convolutional Neural Networks (C...
Chapter
Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater. While SPR based on video sources is well-established, incorporation of interventional X-ray sequences has not yet been explored. This paper presents Pelphix, a first approach to SPR for X-ray-guided percutaneous pelvic fracture fixat...
Chapter
To safely deploy deep learning-based computer vision models for computer-aided detection and diagnosis, we must ensure that they are robust and reliable. Towards that goal, algorithmic auditing has received substantial attention. To guide their audit procedures, existing methods rely on heuristic approaches or high-level objectives (e.g., non-discr...
Article
Full-text available
Background: Eye gaze tracking and pupillometry are emerging topics in telerobotic surgery as it is believed that they will enable novel gaze-based interaction paradigms and provide insights into the user’s cognitive load (CL). Further, the successful integration of CL estimation into telerobotic systems is thought to catalyze the development of new...
Preprint
Full-text available
Object-centric representation learning offers the potential to overcome limitations of image-level representations by explicitly parsing image scenes into their constituent components. While image-level representations typically lack robustness to natural image corruptions, the robustness of object-centric methods remains largely untested. To addre...
Article
Image-based 2D/3D registration is a critical technique for fluoroscopic guided surgical interventions. Conventional intensity-based 2D/3D registration approaches suffer from a limited capture range due to the presence of local minima in hand-crafted image similarity functions. In this work, we aim to extend the 2D/3D registration capture range with...
Preprint
Full-text available
Nuclei appear small in size, yet, in real clinical practice, the global spatial information and correlation of the color or brightness contrast between nuclei and background, have been considered a crucial component for accurate nuclei segmentation. However, the field of automatic nuclei segmentation is dominated by Convolutional Neural Networks (C...
Article
Full-text available
Background: Precision-medicine quantitative tools for cross-sectional imaging require painstaking labeling of targets that vary considerably in volume, prohibiting scaling of data annotation efforts and supervised training to large datasets for robust and generalizable clinical performance. A straightforward time-saving strategy involves manual e...
Preprint
Full-text available
Importance: Ultra-widefield fundus photography (UWF-FP) has shown utility in sickle cell retinopathy screening; however, image artifact may diminish quality and gradeability of images. Objective: To create an automated algorithm for UWF-FP artifact classification. Design: A neural network based automated artifact detection algorithm was designed to...
Preprint
Neural surface reconstruction has been shown to be powerful for recovering dense 3D surfaces via image-based neural rendering. However, current methods struggle to recover detailed structures of real-world scenes. To address the issue, we present Neuralangelo, which combines the representation power of multi-resolution 3D hash grids with neural sur...
Article
Full-text available
Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current cl...