Conference Paper · PDF available

Dense Surface Reconstruction for Enhanced Navigation in MIS

Abstract

The recent introduction of dynamic view expansion has led to the development of computer vision methods for minimally invasive surgery that artificially expand the intra-operative field of view of the laparoscope. This provides improved awareness of the surrounding anatomical structures and minimises the effect of disorientation during surgical navigation, permitting the augmentation of live laparoscope images with information from previously captured views. Current approaches, however, can only represent the tissue geometry as planar surfaces or sparse 3D models, thus introducing noticeable visual artefacts in the final rendering results. This paper proposes high-fidelity tissue geometry mapping by combining a sparse SLAM map with semi-dense surface reconstruction. The method is validated on phantom data with known ground truth, as well as in vivo data captured during a robotic-assisted MIS procedure. The results show that the method effectively increases the coverage of the expanded surgical view without compromising mapping accuracy.
... Dense SLAM methods have also been developed to generate dense tissue models in real-time. Totz et al. [32] proposed an EKF-SLAM-based method for dense reconstruction, but EKF-SLAM suffers from low accuracy and has difficulty representing loop closures. ...
Preprint
We propose an approach to reconstruct a dense three-dimensional (3D) model of the tissue surface from stereo optical videos in real-time. The basic idea is to first extract 3D information from video frames using stereo matching, and then to mosaic the reconstructed 3D models. To handle the low-texture regions common on tissue surfaces, we propose effective post-processing steps for the local stereo matching method to enlarge the radius of constraint, which include outlier removal, hole filling and smoothing. Since the tissue models obtained by stereo matching are limited to the field of view of the imaging modality, we propose a model mosaicking method that uses a novel feature-based simultaneous localization and mapping (SLAM) method to align the models. Low-texture regions and varying illumination conditions may lead to a large percentage of feature matching outliers. To solve this problem, we propose several algorithms to improve the robustness of SLAM, which mainly include (1) a histogram voting-based method to roughly select possible inliers from the feature matching results, (2) a novel 1-point RANSAC-based PnP algorithm called DynamicR1PPnP to track the camera motion, and (3) a GPU-based iterative closest point (ICP) and bundle adjustment (BA) method to refine the camera motion estimation results. Experimental results on ex vivo and in vivo data showed that the reconstructed 3D models have high-resolution texture with an accuracy error of less than 2 mm. Most algorithms are highly parallelized for GPU computation, and the average runtime for processing one key frame is 76.3 ms on stereo images with 960x540 resolution.
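The histogram voting step used above for rough inlier pre-selection can be sketched in a few lines. This is a minimal illustration under our own assumptions (function name, bin size, and the quantisation scheme are ours), not the authors' implementation:

```python
from collections import Counter

def histogram_voting_inliers(matches, bin_size=10.0):
    """Roughly pre-select likely inlier feature matches by voting for
    the dominant 2D displacement, quantised into bins of `bin_size` px.
    `matches` is a list of ((x1, y1), (x2, y2)) point pairs."""
    def bin_of(m):
        (x1, y1), (x2, y2) = m
        return (round((x2 - x1) / bin_size), round((y2 - y1) / bin_size))

    votes = Counter(bin_of(m) for m in matches)
    best_bin, _ = votes.most_common(1)[0]
    # keep only matches whose displacement falls in the winning bin
    return [m for m in matches if bin_of(m) == best_bin]
```

A downstream robust estimator (such as the 1-point RANSAC PnP step the abstract describes) would then run only on this pre-filtered set, which is what makes the rough voting pass worthwhile.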
... The concept of non-rigid SLAM was proposed in the DynamicFusion work [20], and is now an emerging topic in the computer vision field. Unlike the traditional rigid SLAM methods that estimate the 6-DoF rigid motion of the camera [21], non-rigid SLAM estimates the deformation and motion of the environment with respect to the camera, which usually has high degrees of freedom. Our 2D non-rigid SLAM method considers the 2D image mosaic as the environment map, which is similar to the 3D point cloud built by traditional 3D SLAM methods. ...
Preprint
Full-text available
The ability to extend the field of view of laparoscopy images can help the surgeons to obtain a better understanding of the anatomical context. However, due to tissue deformation, complex camera motion and significant three-dimensional (3D) anatomical surface, image pixels may have non-rigid deformation and traditional mosaicking methods cannot work robustly for laparoscopy images in real-time. To solve this problem, a novel two-dimensional (2D) non-rigid simultaneous localization and mapping (SLAM) system is proposed in this paper, which is able to compensate for the deformation of pixels and perform image mosaicking in real-time. The key algorithm of this 2D non-rigid SLAM system is the expectation maximization and dual quaternion (EMDQ) algorithm, which can generate smooth and dense deformation field from sparse and noisy image feature matches in real-time. An uncertainty-based loop closing method has been proposed to reduce the accumulative errors. To achieve real-time performance, both CPU and GPU parallel computation technologies are used for dense mosaicking of all pixels. Experimental results on \textit{in vivo} and synthetic data demonstrate the feasibility and accuracy of our non-rigid mosaicking method.
... The first approach, used in traditional laparoscopy, is based on moving a monocular endoscope in order to reconstruct the 3D surface of the surgical area. Three methods are commonly used to obtain depth information: Structure from Motion (SfM) [3,26], SLAM [27,28], and Shape from Shading (SfS) [29]. However, a disadvantage for both SfM and SLAM is that the camera needs to move constantly in order to obtain 3D information. ...
Article
Full-text available
Purpose The minimally invasive surgery (MIS) approach has shown advantages compared to traditional surgery. However, there are two major challenges in MIS: the limited field of view (FOV) and the lack of depth perception provided by the standard monocular endoscope. Therefore, in this study, we propose a New Endoscope for Panoramic-View with Focus-Area 3D-Vision (3DMISPE) to provide surgeons with a broad view of the surgical area and real-time 3D images.
Method The proposed system consists of two endoscopic cameras fixed to each other. Compared to our previous study, the proposed video stitching algorithm is novel: it is based on stereo vision synthesis theory and can therefore support 3D reconstruction and image stitching at the same time. Moreover, our approach employs the same functions for reconstructing 3D surface images, calculating the overlap region's disparity and stitching the two-view images from both cameras.
Results The experimental results demonstrate that the proposed method can combine the two endoscopes' FOVs into one wider FOV. In addition, the overlap region can be synthesized for 3D display to provide more information about depth and distance, with an error of about 1 mm. The system achieves a frame rate of up to 11.3 fps on a single Intel i5-4590 CPU and 17.6 fps with an additional Nvidia GeForce GTX1060 GPU. Furthermore, the proposed stitching method is 1.4 times faster than that in our previous report. Our method also improves stitched image quality by significantly reducing alignment errors, or "ghosting", compared to the SURF-based stitching method employed in our previous study.
Conclusion The proposed system offers a more efficient way to give doctors a broad field of view while still providing a 3D surface image in real-time applications. Our system shows promise in overcoming existing limitations in laparoscopic surgery, such as the limited FOV and lack of depth perception.
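The depth cue recovered in the stereo overlap region comes from standard triangulation over the measured disparity; a minimal sketch (the function name and numeric values are illustrative, not from the paper):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Triangulate depth (mm) for a pixel in the stereo overlap region:
    Z = f * B / d, with focal length f in pixels, baseline B in mm,
    and disparity d in pixels."""
    if disparity_px <= 0:
        return None  # no valid correspondence at this pixel
    return focal_px * baseline_mm / disparity_px
```

For example, with an assumed 500 px focal length and a 4 mm baseline, a 10 px disparity triangulates to a depth of 200 mm; the roughly 1 mm error reported above would correspond to sub-pixel disparity noise at such working distances.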
... Mosaicking has recently gained attention to increase the FoV in fetoscopy [10,3,11,12,9]. Totz et al. [13] presented a dynamic view expansion and surface reconstruction approach for minimally invasive surgery by analyzing stereo laparoscopy videos. Reeff et al. [10] and Daga et al. [3] utilized a classical image feature-based matching method for creating mosaics from planar placenta images. ...
Chapter
Twin-to-twin transfusion syndrome treatment requires fetoscopic laser photocoagulation of placental vascular anastomoses to regulate blood flow to both fetuses. Limited field-of-view (FoV) and low visual quality during fetoscopy make it challenging to identify all vascular connections. Mosaicking can align multiple overlapping images to generate an image with increased FoV, however, existing techniques apply poorly to fetoscopy due to the low visual quality, texture paucity, and hence fail in longer sequences due to the drift accumulated over time. Deep learning techniques can facilitate in overcoming these challenges. Therefore, we present a new generalized Deep Sequential Mosaicking (DSM) framework for fetoscopic videos captured from different settings such as simulation, phantom, and real environments. DSM extends an existing deep image-based homography model to sequential data by proposing controlled data augmentation and outlier rejection methods. Unlike existing methods, DSM can handle visual variations due to specular highlights and reflection across adjacent frames, hence reducing the accumulated drift. We perform experimental validation and comparison using 5 diverse fetoscopic videos to demonstrate the robustness of our framework.
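The drift such sequential mosaicking frameworks must suppress arises from chaining frame-to-frame homographies: each pairwise estimate carries a small error, and composition accumulates it. A minimal sketch of the composition step (the helper names and pure-Python 3x3 representation are our own):

```python
def mat_mul(a, b):
    """3x3 matrix product over row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def compose_to_reference(pairwise):
    """Chain frame-to-frame homographies H_{i->i-1} into mosaic
    transforms H_{i->0}. Any per-pair estimation error accumulates
    through this product, which is the drift sequential mosaicking
    methods (and loop-closure schemes) try to keep small."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    out, acc = [identity], identity
    for h in pairwise:
        acc = mat_mul(acc, h)
        out.append(acc)
    return out
```

With pure 1 px translations the chained transform translates by exactly n px after n frames; with noisy real estimates the same product magnifies the noise, which motivates the outlier rejection and controlled augmentation described above.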
Article
Full-text available
Objective: 3D reconstruction of the shape and texture of hollow organs captured by endoscopy is important for the diagnosis and surveillance of early and recurrent cancers. Better evaluation of 3D reconstruction pipelines developed for such applications requires easy access to extensive datasets and associated ground truths, cost-efficient and scalable simulations of a range of possible clinical scenarios, and more reliable and insightful metrics to assess performance. Methods: We present a computer-aided simulation platform for cost-effective synthesis of monocular endoscope videos and corresponding ground truths that mimic a range of potential settings and situations one might encounter during acquisition of clinical endoscopy videos. Using cystoscopy of the bladder as a model case, we generated an extensive dataset comprising several synthesized videos of a bladder phantom. We then introduce a novel evaluation procedure to reliably assess an individual 3D reconstruction pipeline or to compare different pipelines. Results: To illustrate the use of the proposed platform and evaluation procedure, we use the aforementioned dataset and ground truths to evaluate a proprietary 3D reconstruction pipeline (CYSTO3D) for bladder cystoscopy videos and compare it with a general-purpose 3D reconstruction pipeline (COLMAP). The evaluation results provide insight into the suggested clinical acquisition protocol and several potential areas for refinement of the pipeline to improve future performance. Conclusion: Our work proposes an endoscope video synthesis and reconstruction evaluation toolset and presents experimental results that illustrate usage of the toolset to efficiently assess performance and reveal possible problems of any given 3D reconstruction pipeline, to compare different pipelines, and to provide technically or clinically actionable insights.
Chapter
We propose a novel stereo laparoscopy video-based non-rigid SLAM method called EMDQ-SLAM, which can incrementally reconstruct three-dimensional (3D) models of soft tissue surfaces in real-time and preserve high-resolution color textures. EMDQ-SLAM uses the expectation maximization and dual quaternion (EMDQ) algorithm combined with SURF features to track the camera motion and estimate tissue deformation between video frames. To overcome the problem of accumulative errors over time, we have integrated a g2o-based graph optimization method that combines the EMDQ mismatch removal and as-rigid-as-possible (ARAP) smoothing methods. Finally, the multi-band blending (MBB) algorithm has been used to obtain high-resolution color textures with real-time performance. Experimental results demonstrate that our method outperforms two state-of-the-art non-rigid SLAM methods: MISSLAM and DefSLAM. Quantitative evaluation shows an average error in the range of 0.8–2.2 mm for different cases.
Article
Markerless tracking has become a trend in augmented reality (AR) applications, but it no longer satisfies users who want virtual characters to interact with the real world, for example through collision. Some sparse or dense simultaneous localization and mapping (SLAM) methods have been proposed to solve this problem. However, sparse methods only extract a plane from the sparse map, which does not allow virtual characters to move realistically. Meanwhile, dense methods usually require a powerful graphics processing unit (GPU) for dense mapping. In this paper, we present a real-time AR framework based on a semi-dense method running on the central processing unit (CPU). Specifically, the semi-dense method searches for pixels with high gradients in each keyframe and estimates accurate depths by fusing matching pixels in other keyframes. We propose an outlier removal method that excludes three-dimensional points outside the camera trajectory. By integrating this method, our framework preserves clean edges of the real environment. Experimental results on the dataset show that our proposed framework has better surface reconstruction accuracy than other methods, and our tracking thread runs at an acceptable speed while the semi-dense mapping thread runs in the backend. With the benefit of robust camera tracking and the aligned surface, virtual characters in our AR application achieve realistic movement and collision.
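The defining step of such semi-dense methods, tracking only pixels with high image gradients, can be illustrated with a toy selection pass (the function name, central-difference gradient, and threshold are our own assumptions):

```python
def semi_dense_candidates(gray, thresh=20.0):
    """Select the pixels a semi-dense mapper would estimate depth for:
    interior pixels whose intensity gradient magnitude exceeds `thresh`.
    `gray` is a 2D list of intensities (one row per scanline)."""
    h, w = len(gray), len(gray[0])
    picks = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # central differences along x and y
            gx = (gray[y][x + 1] - gray[y][x - 1]) / 2.0
            gy = (gray[y + 1][x] - gray[y - 1][x]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 > thresh:
                picks.append((y, x))
    return picks
```

Because only these edge-like pixels get depth estimates, the resulting map is sparse in flat regions but preserves the scene's edges, which matches the "clean edges" property the abstract emphasises.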
Article
Full-text available
Digital inpainting provides a means for reconstruction of small damaged portions of an image. Although the inpainting basics are straightforward, most inpainting techniques published in the literature are complex to understand and implement. We present here a new algorithm for digital inpainting based on the fast marching method for level set applications. Our algorithm is very simple to implement, fast, and produces nearly identical results to more complex, and usually slower, known methods. Source code is available online.
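The boundary-first fill order at the heart of this fast-marching scheme can be mimicked with a toy pass-based fill. Our simplification drops the distance-map and gradient weighting of the actual algorithm and fills each unknown pixel with the plain mean of its known 4-neighbours; in practice one would use OpenCV's `cv2.inpaint` with the `cv2.INPAINT_TELEA` flag, which is based on this method:

```python
def inpaint_simple(img, mask):
    """Toy boundary-first inpainting on a 2D grid of floats.
    Pixels where mask is True are unknown; each pass fills unknown
    pixels that touch known ones with the mean of their known
    4-neighbours, marching inward from the hole boundary much like
    the fast marching method (minus its distance-based weighting)."""
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]  # do not mutate the caller's image
    unknown = {(y, x) for y in range(h) for x in range(w) if mask[y][x]}
    while unknown:
        filled = []
        for (y, x) in unknown:
            nbrs = [img[y + dy][x + dx]
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= y + dy < h and 0 <= x + dx < w
                    and (y + dy, x + dx) not in unknown]
            if nbrs:
                filled.append(((y, x), sum(nbrs) / len(nbrs)))
        if not filled:
            break  # isolated unknown region with no known support
        for (y, x), v in filled:
            img[y][x] = v
            unknown.discard((y, x))
    return img
```

Each outer pass fills one "ring" of the hole, so the fill front marches from the boundary toward the hole's centre, which is the ordering the level-set formulation makes principled.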
Article
Full-text available
Endoscopes used in minimally invasive surgery provide a limited field of view, thus requiring a high degree of spatial awareness and orientation. Attempts at expanding this small, restricted view with previously observed imagery have been made and are generally known as image mosaicing or dynamic view expansion. For minimally invasive endoscopy, SLAM-based methods have been shown to have potential value but have yet to address effective visualisation techniques. The live endoscopic video feed is expanded with previously observed footage. To this end, a method that highlights the difference between the actual camera image and historic data observed earlier is proposed. Old video data is faded out to greyscale to mimic human peripheral vision. Specular highlights are removed with the help of texture synthesis to avoid distracting visual cues. The method is further evaluated on in vivo and phantom sequences through a detailed user study examining the user's ability to discern temporal motion trajectories while visualising the expanded field of view, a feature of practical value for enhancing spatial awareness and orientation. The difference between historic data and live video is integrated effectively. The use of a single texture domain generated by planar parameterisation is demonstrated for view expansion. Specular highlights can be removed through texture synthesis without introducing noticeable artefacts. The implicit encoding of the motion trajectory of the endoscopic camera visualised by the proposed method facilitates both global awareness and temporal evolution of the scene. Dynamic view expansion provides more context for navigation and orientation by establishing reference points beyond the camera's field of view. Effective integration of visual cues is paramount for concise visualisation.
Conference Paper
Full-text available
The recovery of 3D tissue structure and morphology during robotic assisted surgery is an important step towards accurate deployment of surgical guidance and control techniques in minimally invasive therapies. In this article, we present a novel stereo reconstruction algorithm that propagates disparity information around a set of candidate feature matches. This has the advantage of avoiding problems with specular highlights, occlusions from instruments and view dependent illumination bias. Furthermore, the algorithm can be used with any feature matching strategy allowing the propagation of depth in very disparate views. Validation is provided for a phantom model with known geometry and this data is available online in order to establish a structured validation scheme in the field. The practical value of the proposed method is further demonstrated by reconstructions on various in vivo images of robotic assisted procedures, which are also available to the community. Additional material can be found at http://ubimon.doc.ic.ac.uk/dvs/m857.html .
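The propagation strategy described above, growing disparity outward from candidate feature matches in best-first order, can be sketched as follows (the data structures and scoring scheme are our own simplification of such region-growing stereo methods):

```python
import heapq

def propagate_disparity(valid, score, seeds):
    """Grow disparity outward from sparse seed matches, best-first:
    pixels with the highest match confidence are expanded before less
    reliable ones. `valid` is the set of pixels eligible for growth,
    `score` maps pixel -> confidence, `seeds` maps pixel -> disparity.
    Pixels excluded from `valid` (e.g. under a specular highlight or
    an occluding instrument) simply never receive a disparity."""
    disp = dict(seeds)
    heap = [(-score[p], p) for p in seeds]
    heapq.heapify(heap)
    while heap:
        _, (y, x) = heapq.heappop(heap)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (y + dy, x + dx)
            if q in valid and q not in disp:
                # inherit from the propagating neighbour; a real stereo
                # matcher would re-optimise locally around this value
                disp[q] = disp[(y, x)]
                heapq.heappush(heap, (-score[q], q))
    return disp
```

Because growth starts from any feature-matching strategy's seeds and simply skips invalid pixels, the same skeleton naturally avoids specular highlights and instrument occlusions, as the abstract notes.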
Article
Full-text available
Navigation during Minimally Invasive Surgery (MIS) has recognized difficulties due to limited field-of-view, off-axis visualization and loss of direct 3D vision. This can cause visual-spatial disorientation when exploring complex in vivo structures. In this paper, we present an approach to dynamic view expansion which builds a 3D textured model of the MIS environment to facilitate in vivo navigation. With the proposed technique, no prior knowledge of the environment is required and the model is built sequentially while the laparoscope is moved. The method is validated on simulated data with known ground truth. Its potential clinical value is also demonstrated with in vivo experiments.
Article
We approach mosaicing as a camera tracking problem within a known parameterized surface. From a video of a camera moving within a surface, we compute a mosaic representing the texture of that surface, flattened onto a planar image. Our approach works by defining a warp between images as a function of surface geometry and camera pose. Globally optimizing this warp to maximize alignment across all frames determines the camera trajectory, and the corresponding flattened mosaic image. In contrast to previous mosaicing methods which assume planar or distant scenes, or controlled camera motion, our approach enables mosaicing in cases where the camera moves unpredictably through proximal surfaces, such as in medical endoscopy applications.
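The warp in such surface-aware mosaicing is defined through the surface parameterisation: each image point is lifted onto the known surface and then mapped into the flattened texture domain. As a toy illustration of the flattening half of that pipeline (a z-axis cylinder is our own choice of surface, not the paper's general parameterized model):

```python
import math

def flatten_cylinder(point3d, radius):
    """Map a 3D point lying on a cylinder of the given radius (axis
    along z) to the flattened mosaic plane as (arc length, height).
    Unrolling the cylinder preserves distances along both axes, so
    textures mosaicked in this plane are not stretched."""
    x, y, z = point3d
    theta = math.atan2(y, x)      # angular coordinate around the axis
    return (radius * theta, z)    # arc length u = r * theta, v = z
```

Optimizing camera poses so that all frames agree after being pushed through such a surface warp is what lets the method handle proximal, non-planar scenes like endoscopic lumens.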
Conference Paper
Constructing a mosaicing image with a broader field-of-view has become an important topic in image guided diagnosis and treatment. In this paper, we present a robust feature-based method for video mosaicing with super-resolution for optical medical images. Firstly, outliers in the feature dataset are removed using trilinear constraints and iterative bundle adjustment; a minimal-cost graph path is then built for mosaicing using topology inference. Finally, a mosaicing image with super-resolution is created by way of maximum a posteriori (MAP) estimation and selective initialization. The proposed method has been tested with both endoscopic images from totally endoscopic coronary artery bypass surgery and fibered confocal microscopy images. The results show that our method performs better than previously reported methods in terms of accuracy and robustness to deformation and artefacts.
Article
Recent advances in surgical robotics have provided a platform for extending the current capabilities of minimally invasive surgery by incorporating both preoperative and intraoperative imaging data. In this tutorial article, we introduce techniques for in vivo three-dimensional (3-D) tissue deformation recovery and tracking based on laparoscopic or endoscopic images. These optically based techniques provide a unique opportunity for recovering surface deformation of the soft tissue without the need of additional instrumentation. They can therefore be easily incorporated into the existing surgical workflow. Technically, the problem formulation is challenging due to nonrigid deformation of the tissue and instrument interaction. Current approaches and future research directions in terms of intraoperative planning and adaptive surgical navigation are explained in detail.
Conference Paper
In this paper we present a real-time intra-operative reconstruction system for laparoscopic surgery. The system builds upon a surgical robot for laparoscopy that has previously been developed by us. Such a system is valuable for surgeons, who can get a three dimensional visualization of the scene online, without having to postprocess data. We gain a significant speed increase over existing such systems by carefully parallelizing tasks and using the GPU for computationally expensive sub-tasks, making real-time reconstruction and visualization possible. Our implementation is also robust with respect to outliers and can potentially be extended to be used with non-robotic surgery. We demonstrate the performance of our system on ex-vivo samples and compare it to alternative implementations.
We present a method which enables rapid and dense reconstruction of scenes browsed by a single live camera. We take point-based real-time structure from motion (SFM) as our starting point, generating accurate 3D camera pose estimates and a sparse point cloud. Our main novel contribution is to use an approximate but smooth base mesh generated from the SFM to predict the view at a bundle of poses around automatically selected reference frames spanning the scene, and then warp the base mesh into highly accurate depth maps based on view-predictive optical flow and a constrained scene flow update. The quality of the resulting depth maps means that a convincing global scene model can be obtained simply by placing them side by side and removing overlapping regions. We show that a cluttered indoor environment can be reconstructed from a live hand-held camera in a few seconds, with all processing performed by current desktop hardware. Real-time monocular dense reconstruction opens up many application areas, and we demonstrate both real-time novel view synthesis and advanced augmented reality where augmentations interact physically with the 3D scene and are correctly clipped by occlusions.