Mathias Unberath

Johns Hopkins University | JHU · Laboratory for Computational Sensing and Robotics

PhD

About

184
Publications
25,155
Reads
1,496
Citations
Introduction
I lead the Advanced Robotics and Computationally AugmenteD Environments (ARCADE) lab. Our research sits at the intersection of computer vision, machine learning, augmented reality, robotics, and medical imaging, with the aim of developing collaborative systems that advance decision making across the healthcare spectrum.
Additional affiliations
October 2018 - present
Johns Hopkins University
Position
  • Professor
Description
  • Affiliated with the Malone Center for Engineering in Healthcare and the Laboratory for Computational Sensing and Robotics.
June 2017 - October 2018
Johns Hopkins University
Position
  • PostDoc Position
January 2017 - April 2017
Johns Hopkins University
Position
  • Visiting Researcher
Education
November 2014 - May 2017
Friedrich-Alexander-University of Erlangen-Nürnberg
Field of study
  • Computer Science, Pattern Recognition Lab
March 2014 - October 2014
Stanford University
Field of study
  • Radiology
October 2012 - October 2014
Friedrich-Alexander-University of Erlangen-Nürnberg
Field of study
  • Advanced Optical Technologies

Publications
Preprint
Full-text available
Artificial intelligence (AI) now enables automated interpretation of medical images for clinical use. However, AI's potential use for interventional images (versus those involved in triage or diagnosis), such as for guidance during surgery, remains largely untapped. This is because surgical AI systems are currently trained using post hoc analysis o...
Article
Purpose: Patient motion artifacts present a prevalent challenge to image quality in interventional cone-beam CT (CBCT). We propose a novel reference-free similarity metric (DL-VIF) that leverages the capability of deep convolutional neural networks (CNN) to learn features associated with motion artifacts within realistic anatomical features. DL-VI...
Article
Purpose: The purpose of this study was to accurately forecast future reliable visual field (VF) mean deviation (MD) values by correcting for poor reliability. Methods: Four linear regression techniques (standard, unfiltered, corrected, and weighted) were fit to VF data from 5939 eyes with a final reliable VF. For each eye, all VFs, except the fi...
Preprint
Full-text available
Vision-based segmentation of the robotic tool during robot-assisted surgery enables downstream applications, such as augmented reality feedback, while allowing for inaccuracies in robot kinematics. With the introduction of deep learning, many methods were presented to solve instrument segmentation directly and solely from images. While these approa...
Article
Full-text available
Purpose: Mixed reality (MR) for image-guided surgery may enable unobtrusive solutions for precision surgery. To display preoperative treatment plans at the correct physical position, it is essential to spatially align them with the patient intra-operatively. Accurate alignment is safety-critical because it will guide treatment, but cannot always be...
Preprint
Full-text available
In endoscopy, many applications (e.g., surgical navigation) would benefit from a real-time method that can simultaneously track the endoscope and reconstruct the dense 3D geometry of the observed anatomy from a monocular endoscopic video. To this end, we develop a Simultaneous Localization and Mapping system by combining the learning-based appearan...
Article
Full-text available
Ear-related concerns and symptoms represent the leading indication for seeking pediatric healthcare attention. Despite the high incidence of such encounters, the diagnostic process of commonly encountered diseases of the middle and external ear presents a significant challenge. Much of this challenge stems from the lack of cost-effective diagnostic tes...
Preprint
Full-text available
Understanding Deep Neural Network (DNN) performance in changing conditions is essential for deploying DNNs in safety-critical applications with unconstrained environments, e.g., perception for self-driving vehicles or medical image analysis. Recently, the task of Network Generalization Prediction (NGP) has been proposed to predict how a DNN will ge...
Preprint
Full-text available
Transparency in Machine Learning (ML) attempts to reveal the working mechanisms of complex models. Transparent ML promises to advance human factors engineering goals of human-centered AI in the target users. From a human-centered design perspective, transparency is not a property of the ML model but an affordance, i.e., a relationship between algor...
Conference Paper
Full-text available
Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right images to infer depth. In this work, we re-visit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using position information and attention. This appr...
Article
Accurate pin placement to guide reaming of the glenoid surface in total shoulder arthroplasty (TSA) is a critical step to restore range of motion in the glenohumeral joint. Achieving proper pin position free-hand is complicated by inadequate intra-operative availability of pre-operative planning data. Mixed reality provides a new modality...
Preprint
Full-text available
Deep neural networks for computer vision tasks are deployed in increasingly safety-critical and socially-impactful applications, motivating the need to close the gap in model performance under varied, naturally occurring imaging conditions. Robustness, ambiguously used in multiple contexts including adversarial machine learning, here then refers to...
Article
Surgical simulators not only allow planning and training of complex procedures, but also offer the ability to generate structured data for algorithm development, which may be applied in image-guided computer assisted interventions. While there have been efforts on either developing training platforms for surgeons or data generation engines, these t...
Preprint
Full-text available
Temporally consistent depth estimation is crucial for real-time applications such as augmented reality. While stereo depth estimation has received substantial attention that led to improvements on a frame-by-frame basis, there is relatively little work focused on maintaining temporal consistency across frames. Indeed, based on our analysis, current...
Preprint
Full-text available
Surgical simulators not only allow planning and training of complex procedures, but also offer the ability to generate structured data for algorithm development, which may be applied in image-guided computer assisted interventions. While there have been efforts on either developing training platforms for surgeons or data generation engines, these t...
Preprint
Full-text available
Ear-related concerns and symptoms represent the leading indication for seeking pediatric healthcare attention. Despite the high incidence of such encounters, the diagnostic process of commonly encountered diseases of the middle and external ear presents a significant challenge. Much of this challenge stems from the lack of cost-effective diagnostic testi...
Chapter
Reconstructing the scene of robotic surgery from the stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education and intra-operative context awareness. However, current methods are mostly restricted to reconstructing...
Chapter
Pelvic ring disruptions result from blunt injury mechanisms and are often found in patients with multi-system trauma. To grade pelvic fracture severity in trauma victims based on whole-body CT, the Tile AO/OTA classification is frequently used. Due to the high volume of whole-body trauma CTs generated in busy trauma centers, an automated approach t...
Article
Full-text available
AI relates broadly to the science of developing computer systems to imitate human intelligence, thus allowing for the automation of tasks that would otherwise necessitate human cognition. Such technology has increasingly demonstrated capacity to outperform humans for functions relating to image recognition. Given the current lack of cost-effective...
Preprint
Full-text available
Scene depth estimation from stereo and monocular imagery is critical for extracting 3D information for downstream tasks such as scene understanding. Recently, learning-based methods for depth estimation have received much attention due to their high performance and flexibility in hardware choice. However, collecting ground truth data for supervised...
Article
Background: To date, deep learning-based detection of optic disc abnormalities in color fundus photographs has mostly been limited to the field of glaucoma. However, many life-threatening systemic and neurological conditions can manifest as optic disc abnormalities. In this study, we aimed to extend the application of deep learning (DL) in optic d...
Article
Objective: To estimate the effect of achieving target intraocular pressure (IOP) values on visual field (VF) worsening in a treated clinical population. Design: Retrospective analysis of longitudinal data. Participants: 2,852 eyes of 1,688 patients with glaucoma-related diagnoses treated in a tertiary care practice. All included eyes had at least fi...
Article
Full-text available
Image-based navigation is widely considered the next frontier of minimally invasive surgery. It is believed that image-based navigation will increase the access to reproducible, safe, and high-precision surgery as it may then be performed at acceptable costs and effort. This is because image-based techniques avoid the need for specialized equipment...
Preprint
Full-text available
Algorithmic decision support is rapidly becoming a staple of personalized medicine, especially for high-stakes recommendations in which access to certain information can drastically alter the course of treatment, and thus, patient outcome; a prominent example is radiomics for cancer subtyping. Because in these scenarios the stakes are high, it is d...
Article
With the advent of robotic C-arm computed tomography (CT) systems in medicine and twin-robotic CT systems in industry, new possibilities for the realisation of complex trajectories for CT scans are emerging. These trajectories will increase the range of CT applications, enable optimisation of image quality for many applications and open up new poss...
Preprint
Full-text available
Image-based navigation is widely considered the next frontier of minimally invasive surgery. It is believed that image-based navigation will increase the access to reproducible, safe, and high-precision surgery as it may then be performed at acceptable costs and effort. This is because image-based techniques avoid the need for specialized equipment...
Article
We present an image-based navigation solution for a surgical robotic system with a Continuum Manipulator (CM). Our navigation system uses only fluoroscopic images from a mobile C-arm to estimate the CM shape and pose with respect to the bone anatomy. The CM pose and shape estimation is achieved using image intensity-based 2D/3D registration. A lear...
Preprint
Full-text available
Reconstructing the scene of robotic surgery from the stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education and intra-operative context awareness. However, current methods are mostly restricted to reconstructing...
Article
Full-text available
Image stitching is a prominent challenge in medical imaging, where the limited field-of-view captured by single images prohibits holistic analysis of patient anatomy. The barrier that prevents straight-forward mosaicing of 2D images is depth mismatch due to parallax. In this work, we leverage the Fourier slice theorem to aggregate information from...
Preprint
Full-text available
In 2015 we began a sub-challenge at the EndoVis workshop at MICCAI in Munich using endoscope images of ex-vivo tissue with automatically generated annotations from robot forward kinematics and instrument CAD models. However, the limited background variation and simple motion rendered the dataset uninformative in learning about which techniques woul...
Article
We present a novel methodology to detect imperfect bilateral symmetry in CT of human anatomy. In this paper, the structurally symmetric nature of the pelvic bone is explored and is used to provide interventional image augmentation for treatment of unilateral fractures in patients with traumatic injuries. The mathematical basis of our solution is ba...
Conference Paper
Full-text available
Extracting geometric features from 3D models is a common first step in applications such as 3D registration, tracking, and scene flow estimation. Many hand-crafted and learning-based methods aim to produce consistent and distinguishable geometric features for 3D models with partial overlap. These methods work well in cases where the point density a...
Preprint
Full-text available
Pelvic ring disruptions result from blunt injury mechanisms and are often found in patients with multi-system trauma. To grade pelvic fracture severity in trauma victims based on whole-body CT, the Tile AO/OTA classification is frequently used. Due to the high volume of whole-body trauma CTs generated in busy trauma centers, an automated approach t...
Article
As the scope and scale of the COVID-19 pandemic became clear in early March of 2020, the faculty of the Malone Center engaged in several projects aimed at addressing both immediate and long-term implications of COVID-19. In this article, we briefly outline the processes that we engaged in to identify areas of need, the projects that emerged, and th...
Article
Fully automatic X-ray to CT registration requires a solid initialization to provide an initial alignment within the capture range of existing intensity-based registrations. This work addresses that need by providing a novel automatic initialization, which enables end-to-end registration. First, a neural network is trained once to detect a set of ana...
Article
Purpose: Multi- and cross-modal learning consolidates information from multiple data sources which may offer a holistic representation of complex scenarios. Cross-modal learning is particularly interesting, because synchronized data streams are immediately useful as self-supervisory signals. The prospect of achieving self-supervised continual learnin...
Article
Full-text available
Admission trauma whole-body CT is routinely employed as a first-line diagnostic tool for characterizing pelvic fracture severity. Tile AO/OTA grade based on the presence or absence of rotational and translational instability corresponds with need for interventions including massive transfusion and angioembolization. An automated method could be hig...
Chapter
Facilitating quantitative analysis of cytology images of fine needle aspirates of uveal melanoma is important to confirm diagnosis and inform management decisions. Extracting high-quality regions of interest (ROIs) from cytology whole slide images is a critical first step. To the best of our knowledge, we describe the first unsupervised clustering-...
Preprint
Full-text available
Fully automatic X-ray to CT registration requires a solid initialization to provide an initial alignment within the capture range of existing intensity-based registrations. This work addresses that need by providing a novel automatic initialization, which enables end-to-end registration. First, a neural network is trained once to detect a set of ana...
Article
Head-mounted loupes can increase the user's visual acuity to observe the details of an object. On the other hand, optical see-through head-mounted displays (OST-HMD) are able to provide virtual augmentations registered with real objects. In this paper, we propose AR-Loupe, combining the advantages of loupes and OST-HMDs, to offer augmented reality...
Article
Full-text available
Suboptimal interaction with patient data and challenges in mastering 3D anatomy based on ill-posed 2D interventional images are essential concerns in image-guided therapies. Augmented reality (AR) has been introduced in the operating rooms in the last decade; however, in image-guided interventions, it has often only been considered as a visualizati...
Preprint
Full-text available
Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right image to infer depth. Rather than matching individual pixels, in this work, we revisit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using positio...
Preprint
Full-text available
Automatic surgical gesture recognition is fundamentally important to enable intelligent cognitive assistance in robotic surgery. With recent advancement in robot-assisted minimally invasive surgery, rich information including surgical videos and robotic kinematics can be recorded, which provide complementary knowledge for understanding surgical ges...
Preprint
Full-text available
Self-supervised, multi-modal learning has been successful in holistic representation of complex scenarios. This can be useful to consolidate information from multiple modalities which have multiple, versatile uses. Its application in surgical robotics can lead to simultaneously developing a generalised machine understanding of the surgical process...
Article
Full-text available
Total Shoulder Arthroplasty (TSA) is a shoulder replacement procedure to treat severe rotator cuff deficiency, primarily caused by osteoarthritis in elderly patients. One of the critical factors in reducing postoperative complications is accurate drilling of a centring hole on the glenoid surface at a precise position and orientation. While the dri...
Article
Purpose: Many interventional procedures aim at changing soft tissue perfusion or blood flow. One problem at present is that soft tissue perfusion and its changes cannot be assessed in an interventional suite because cone-beam computed tomography is too slow (it takes 4-10 s per volume scan). In order to address the problem, we propose a novel meth...
Chapter
Full-text available
Traditional intensity-based 2D/3D registration requires near-perfect initialization in order for image similarity metrics to yield meaningful updates of X-ray pose and reduce the likelihood of getting trapped in a local minimum. The conventional approaches strongly depend on image appearance rather than content, and therefore, fail in revealing lar...
Preprint
Full-text available
In response to the rapid spread of the novel coronavirus, SARS-CoV-2, the U.S. has largely delegated implementation and rollback of non-pharmaceutical interventions (NPIs) to local governments on the state and county level. This asynchronous response combined with the heterogeneity of the U.S. complicates quantification of the effect of NPIs on the...
Conference Paper
Full-text available
Reconstructing accurate 3D surface models of sinus anatomy directly from an endoscopic video is a promising avenue for cross-sectional and longitudinal analysis to better understand the relationship between sinus anatomy and surgical outcomes. We present a patient-specific, learning-based method for 3D reconstruction of sinus surface anatomy direct...