About
184
Publications
25,155
Reads
1,496
Citations
Introduction
I lead the Advanced Robotics and Computationally AugmenteD Environments (ARCADE) lab. Our research is at the intersection of computer vision, machine learning, augmented reality, robotics, and medical imaging to develop collaborative systems that advance decision making across the healthcare spectrum.
Publications (184)
Artificial intelligence (AI) now enables automated interpretation of medical images for clinical use. However, AI's potential use for interventional images (versus those involved in triage or diagnosis), such as for guidance during surgery, remains largely untapped. This is because surgical AI systems are currently trained using post hoc analysis o...
Purpose:
Patient motion artifacts present a prevalent challenge to image quality in interventional cone-beam CT (CBCT). We propose a novel reference-free similarity metric (DL-VIF) that leverages the capability of deep convolutional neural networks (CNN) to learn features associated with motion artifacts within realistic anatomical features. DL-VI...
Purpose:
The purpose of this study was to accurately forecast future reliable visual field (VF) mean deviation (MD) values by correcting for poor reliability.
Methods:
Four linear regression techniques (standard, unfiltered, corrected, and weighted) were fit to VF data from 5939 eyes with a final reliable VF. For each eye, all VFs, except the fi...
Vision-based segmentation of the robotic tool during robot-assisted surgery enables downstream applications, such as augmented reality feedback, while allowing for inaccuracies in robot kinematics. With the introduction of deep learning, many methods were presented to solve instrument segmentation directly and solely from images. While these approa...
Purpose:
Mixed reality (MR) for image-guided surgery may enable unobtrusive solutions for precision surgery. To display preoperative treatment plans at the correct physical position, it is essential to spatially align them with the patient intra-operatively. Accurate alignment is safety-critical because it will guide treatment, but cannot always be...
In endoscopy, many applications (e.g., surgical navigation) would benefit from a real-time method that can simultaneously track the endoscope and reconstruct the dense 3D geometry of the observed anatomy from a monocular endoscopic video. To this end, we develop a Simultaneous Localization and Mapping system by combining the learning-based appearan...
Ear-related concerns and symptoms represent the leading indication for seeking pediatric healthcare attention. Despite the high incidence of such encounters, the diagnostic process for commonly encountered diseases of the middle and external ear presents a significant challenge. Much of this challenge stems from the lack of cost-effective diagnostic tes...
Understanding Deep Neural Network (DNN) performance in changing conditions is essential for deploying DNNs in safety critical applications with unconstrained environments, e.g., perception for self-driving vehicles or medical image analysis. Recently, the task of Network Generalization Prediction (NGP) has been proposed to predict how a DNN will ge...
Transparency in Machine Learning (ML) attempts to reveal the working mechanisms of complex models. Transparent ML promises to advance human factors engineering goals of human-centered AI in the target users. From a human-centered design perspective, transparency is not a property of the ML model but an affordance, i.e., a relationship between algor...
Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right images to infer depth. In this work, we re-visit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using position information and attention. This appr...
Accurate pin placement to guide reaming of the glenoid surface in total shoulder arthroplasty (TSA) is a critical step to restore range of motion in the glenohumeral joint. Achieving proper pin position free-hand is complicated by inadequate intra-operative availability of pre-operative planning data. Mixed reality provides a new modality...
Deep neural networks for computer vision tasks are deployed in increasingly safety-critical and socially-impactful applications, motivating the need to close the gap in model performance under varied, naturally occurring imaging conditions. Robustness, ambiguously used in multiple contexts including adversarial machine learning, here then refers to...
Surgical simulators not only allow planning and training of complex procedures, but also offer the ability to generate structured data for algorithm development, which may be applied in image-guided computer assisted interventions. While there have been efforts on either developing training platforms for surgeons or data generation engines, these t...
Temporally consistent depth estimation is crucial for real-time applications such as augmented reality. While stereo depth estimation has received substantial attention that led to improvements on a frame-by-frame basis, there is relatively little work focused on maintaining temporal consistency across frames. Indeed, based on our analysis, current...
Ear-related concerns and symptoms represent the leading indication for seeking pediatric healthcare attention. Despite the high incidence of such encounters, the diagnostic process for commonly encountered disease of the middle and external ear presents a significant challenge. Much of this challenge stems from the lack of cost-effective diagnostic testi...
Reconstructing the scene of robotic surgery from the stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education and intra-operative context awareness. However, current methods are mostly restricted to reconstructing...
Pelvic ring disruptions result from blunt injury mechanisms and are often found in patients with multi-system trauma. To grade pelvic fracture severity in trauma victims based on whole-body CT, the Tile AO/OTA classification is frequently used. Due to the high volume of whole-body trauma CTs generated in busy trauma centers, an automated approach t...
AI relates broadly to the science of developing computer systems to imitate human intelligence, thus allowing for the automation of tasks that would otherwise necessitate human cognition. Such technology has increasingly demonstrated capacity to outperform humans for functions relating to image recognition. Given the current lack of cost-effective...
Scene depth estimation from stereo and monocular imagery is critical for extracting 3D information for downstream tasks such as scene understanding. Recently, learning-based methods for depth estimation have received much attention due to their high performance and flexibility in hardware choice. However, collecting ground truth data for supervised...
Background:
To date, deep learning-based detection of optic disc abnormalities in color fundus photographs has mostly been limited to the field of glaucoma. However, many life-threatening systemic and neurological conditions can manifest as optic disc abnormalities. In this study, we aimed to extend the application of deep learning (DL) in optic d...
Objective
To estimate the effect of achieving target intraocular pressure (IOP) values on visual field (VF) worsening in a treated clinical population.
Design
Retrospective analysis of longitudinal data.
Participants
2,852 eyes of 1,688 patients with glaucoma-related diagnoses treated in a tertiary care practice. All included eyes had at least fi...
Image-based navigation is widely considered the next frontier of minimally invasive surgery. It is believed that image-based navigation will increase the access to reproducible, safe, and high-precision surgery as it may then be performed at acceptable costs and effort. This is because image-based techniques avoid the need of specialized equipment...
Algorithmic decision support is rapidly becoming a staple of personalized medicine, especially for high-stakes recommendations in which access to certain information can drastically alter the course of treatment, and thus, patient outcome; a prominent example is radiomics for cancer subtyping. Because in these scenarios the stakes are high, it is d...
With the advent of robotic C-arm computed tomography (CT) systems in medicine and twin-robotic CT systems in industry, new possibilities for the realisation of complex trajectories for CT scans are emerging. These trajectories will increase the range of CT applications, enable optimisation of image quality for many applications and open up new poss...
We present an image-based navigation solution for a surgical robotic system with a Continuum Manipulator (CM). Our navigation system uses only fluoroscopic images from a mobile C-arm to estimate the CM shape and pose with respect to the bone anatomy. The CM pose and shape estimation is achieved using image intensity-based 2D/3D registration. A lear...
Image stitching is a prominent challenge in medical imaging, where the limited field-of-view captured by single images prohibits holistic analysis of patient anatomy. The barrier that prevents straight-forward mosaicing of 2D images is depth mismatch due to parallax. In this work, we leverage the Fourier slice theorem to aggregate information from...
In 2015 we began a sub-challenge at the EndoVis workshop at MICCAI in Munich using endoscope images of ex-vivo tissue with automatically generated annotations from robot forward kinematics and instrument CAD models. However, the limited background variation and simple motion rendered the dataset uninformative in learning about which techniques woul...
We present a novel methodology to detect imperfect bilateral symmetry in CT of human anatomy. In this paper, the structurally symmetric nature of the pelvic bone is explored and is used to provide interventional image augmentation for treatment of unilateral fractures in patients with traumatic injuries. The mathematical basis of our solution is ba...
Extracting geometric features from 3D models is a common first step in applications such as 3D registration, tracking, and scene flow estimation. Many hand-crafted and learning-based methods aim to produce consistent and distinguishable geometric features for 3D models with partial overlap. These methods work well in cases where the point density a...
As the scope and scale of the COVID-19 pandemic became clear in early March of 2020, the faculty of the Malone Center engaged in several projects aimed at addressing both immediate and long-term implications of COVID-19. In this article, we briefly outline the processes that we engaged in to identify areas of need, the projects that emerged, and th...
Fully automatic X-ray to CT registration requires a solid initialization to provide an initial alignment within the capture range of existing intensity-based registrations. This work addresses that need by providing a novel automatic initialization, which enables end-to-end registration. First, a neural network is trained once to detect a set of ana...
Purpose:
Multi- and cross-modal learning consolidates information from multiple data sources which may offer a holistic representation of complex scenarios. Cross-modal learning is particularly interesting, because synchronized data streams are immediately useful as self-supervisory signals. The prospect of achieving self-supervised continual learnin...
Admission trauma whole-body CT is routinely employed as a first-line diagnostic tool for characterizing pelvic fracture severity. Tile AO/OTA grade based on the presence or absence of rotational and translational instability corresponds with need for interventions including massive transfusion and angioembolization. An automated method could be hig...
Facilitating quantitative analysis of cytology images of fine needle aspirates of uveal melanoma is important to confirm diagnosis and inform management decisions. Extracting high-quality regions of interest (ROIs) from cytology whole slide images is a critical first step. To the best of our knowledge, we describe the first unsupervised clustering-...
Head-mounted loupes can increase the user's visual acuity to observe the details of an object. On the other hand, optical see-through head-mounted displays (OST-HMD) are able to provide virtual augmentations registered with real objects. In this paper, we propose AR-Loupe, combining the advantages of loupes and OST-HMDs, to offer augmented reality...
Suboptimal interaction with patient data and challenges in mastering 3D anatomy based on ill-posed 2D interventional images are essential concerns in image-guided therapies. Augmented reality (AR) has been introduced in the operating rooms in the last decade; however, in image-guided interventions, it has often only been considered as a visualizati...
Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right images to infer depth. Rather than matching individual pixels, in this work, we revisit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using positio...
Automatic surgical gesture recognition is fundamentally important to enable intelligent cognitive assistance in robotic surgery. With recent advancement in robot-assisted minimally invasive surgery, rich information including surgical videos and robotic kinematics can be recorded, which provide complementary knowledge for understanding surgical ges...
Self-supervised, multi-modal learning has been successful in holistic representation of complex scenarios. This can be useful to consolidate information from multiple modalities which have multiple, versatile uses. Its application in surgical robotics can lead to simultaneously developing a generalised machine understanding of the surgical process...
Total Shoulder Arthroplasty (TSA) is a shoulder replacement procedure to treat severe rotator cuff deficiency, primarily caused by osteoarthritis in elderly patients. One of the critical factors in reducing postoperative complications is accurate drilling of a centring hole on the glenoid surface at a precise position and orientation. While the dri...
Purpose:
Many interventional procedures aim at changing soft tissue perfusion or blood flow. One problem at present is that soft tissue perfusion and its changes cannot be assessed in an interventional suite because cone-beam computed tomography is too slow (it takes 4-10 s per volume scan). In order to address the problem, we propose a novel meth...
Traditional intensity-based 2D/3D registration requires near-perfect initialization in order for image similarity metrics to yield meaningful updates of X-ray pose and reduce the likelihood of getting trapped in a local minimum. The conventional approaches strongly depend on image appearance rather than content, and therefore, fail in revealing lar...
In response to the rapid spread of the novel coronavirus, SARS-CoV-2, the U.S. has largely delegated implementation and rollback of non-pharmaceutical interventions (NPIs) to local governments on the state and county level. This asynchronous response combined with the heterogeneity of the U.S. complicates quantification of the effect of NPIs on the...
Reconstructing accurate 3D surface models of sinus anatomy directly from an endoscopic video is a promising avenue for cross-sectional and longitudinal analysis to better understand the relationship between sinus anatomy and surgical outcomes. We present a patient-specific, learning-based method for 3D reconstruction of sinus surface anatomy direct...