Fig. 11
Example of augmented reality visualization of the resection model (case 2 in this study) from the surgical navigation solution presented in this study, together with the marked resection margin and the surgical foam used for target registration error (TRE) evaluation in the resection margin analysis.
Source publication
In laparoscopic liver resection, surgeons conventionally rely on anatomical landmarks detected through a laparoscope, preoperative volumetric images and laparoscopic ultrasound to compensate for the challenges of minimally invasive access. Image guidance using optical tracking and registration procedures is a promising tool, although often undermin...
Similar publications
Artificial intelligence makes surgical resection easier and safer, and, at the same time, can improve oncological results. The robotic system fits perfectly with these more or less diffused technologies, and it seems that this benefit is mutual. In liver surgery, robotic systems help surgeons to localize tumors and improve surgical results with wel...
Citations
... Pelanis et al. [8] proposed using fiducials injected into the liver parenchyma with fluoroscopic registration updates. Falkenberg et al. ...
... Video to CT registration was initially attempted through a 3D-3D formulation where the intra-operative surface of the liver is reconstructed from video using laser range scanners [4], intra-operative CT [5] and stereo laparoscopes [6,7]. From these options, only stereo laparoscopes are compatible with the current scenario of laparoscopic surgery. ...
... Then, we extract relevant liver features for registration (2) and encode them with a convolutional neural network (CNN)-based encoder to create a database that matches camera pose to hash code (3). Intra-operatively, video images are segmented for the same features extracted in the database generation step (4), encoded with the same model (5), and a nearest neighbour search is used to find the closest hash codes in Euclidean space (6). Finally, a refinement based on Hausdorff distance is applied to estimate the final camera pose and provide an AR overlay (7). ...
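To make the retrieval step described above concrete, the following is a minimal sketch, assuming a pre-built database of hash codes, camera poses, and rendered contours is available; the function and variable names (retrieve_pose, db_codes, db_poses, db_contours) are illustrative placeholders rather than the authors' implementation.

```python
# A minimal sketch of hash-code retrieval with contour-based refinement.
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def retrieve_pose(query_code, query_contour, db_codes, db_poses, db_contours, k=10):
    """Return the database pose whose rendered view best matches the query image.

    query_code   : (D,) hash code of the intra-operative image
    query_contour: (M, 2) liver contour points extracted from the image
    db_codes     : (N, D) hash codes of the pre-operative renderings
    db_poses     : (N, 4, 4) camera poses used to render each view
    db_contours  : list of (Mi, 2) contour points of each rendering
    """
    # 1) Nearest-neighbour search in Euclidean space of the hash codes.
    dists = np.linalg.norm(db_codes - query_code[None, :], axis=1)
    candidates = np.argsort(dists)[:k]

    # 2) Refinement: keep the candidate whose rendered contour is closest to the
    #    query contour under the symmetric Hausdorff distance.
    def hausdorff(a, b):
        return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

    best = min(candidates, key=lambda i: hausdorff(query_contour, db_contours[i]))
    return db_poses[best]
```

In this sketch the candidate set returned by the hash-code search is re-ranked with a Hausdorff distance between contours, mirroring the refinement described in step (7).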
Purpose
Registration of computed tomography (CT) to laparoscopic video images is vital to enable augmented reality (AR), a technology that holds the promise of minimising the risk of complications during laparoscopic liver surgery. Although several solutions have been presented in the literature, they always rely on an accurate initialisation of the registration that is either obtained manually or automatically estimated on very specific views of the liver. These limitations pose a challenge to the clinical translation of AR.
Methods
We propose the use of a content-based image retrieval (CBIR) framework to obtain an automatic, robust initialisation of the registration. Instead of directly registering video and CT, we render a dense set of possible views of the liver from CT and extract liver contour features. To reduce the feature maps to lower-dimensional vectors, we use a deep hashing (DH) network trained in a triplet scheme. Registration is obtained by matching the hash encoding of the intra-operative image to the closest encodings found among the pre-operative renderings.
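As an illustration of the triplet training scheme mentioned above, here is a small PyTorch sketch; the backbone, code length, margin, and data loader are assumptions made for exposition, not the authors' exact configuration.

```python
# A minimal sketch of training a hashing encoder with a triplet scheme.
import torch
import torch.nn as nn

class HashEncoder(nn.Module):
    def __init__(self, code_bits=64):
        super().__init__()
        self.backbone = nn.Sequential(                       # stand-in feature extractor
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, code_bits)

    def forward(self, x):
        # tanh keeps outputs in (-1, 1); codes can be binarised with sign() at test time
        return torch.tanh(self.head(self.backbone(x)))

def train(encoder, loader, epochs=1):
    triplet = nn.TripletMarginLoss(margin=1.0)               # pulls anchor/positive together, pushes negative apart
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
    for _ in range(epochs):
        # `loader` is assumed to yield (anchor, positive, negative) contour-image
        # triplets, e.g. renderings from nearby vs. distant camera poses.
        for anchor, positive, negative in loader:
            loss = triplet(encoder(anchor), encoder(positive), encoder(negative))
            opt.zero_grad()
            loss.backward()
            opt.step()
```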
Results
We validate our method on synthetic and real phantom data, as well as real patient data from eight surgeries. Phantom experiments show that registration errors acceptable for an initial registration are obtained if a sufficient number of pre-operative renderings is considered. In seven out of eight patients, the method is able to obtain a clinically relevant alignment.
Conclusion
We present the first work to adapt DH to the CT-to-video registration problem. Our results indicate that this framework can effectively replace manual initialisations in multiple views, potentially aiding the clinical translation of these techniques.
... If a critical structure (e.g., a blood vessel or nerve) lies only a few millimeters from the operative field, a 2 cm (20 mm) localization error could lead to unintended injury or an incomplete resection. For instance, laparoscopic tumor resections aim for only ~5 mm margins of healthy tissue [40]. ...
Tracking the precise movement of surgical tools is essential for enabling automated analysis, providing feedback, and enhancing safety in robotic-assisted surgery. Accurate 3D tracking of surgical tooltips is challenging to implement when using monocular videos due to the complexity of extracting depth information. We propose a pipeline that combines state-of-the-art foundation models—Florence2 and Segment Anything 2 (SAM2)—for zero-shot 2D localization of tooltip coordinates using a monocular video input. Localization predictions are refined through supervised training of the YOLOv11 segmentation model to enable real-time applications. The depth estimation model Metric3D computes the relative depth and provides tooltip camera coordinates, which are subsequently transformed into world coordinates via a linear model estimating rotation and translation parameters. An experimental evaluation on the JIGSAWS Suturing Kinematic dataset achieves a 3D Average Jaccard score on tooltip tracking of 84.5 and 91.2 for the zero-shot and supervised approaches, respectively. The results validate the effectiveness of our approach and its potential to enhance real-time guidance and assessment in robotic-assisted surgical procedures.
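The transformation of tooltip camera coordinates into world coordinates via an estimated rotation and translation can be illustrated with a least-squares rigid fit (Kabsch/Procrustes); this sketch assumes paired calibration points are available and is not the authors' exact linear model.

```python
# A minimal sketch of fitting a rigid camera-to-world transform from paired points.
import numpy as np

def fit_rigid_transform(cam_pts, world_pts):
    """Least-squares estimate of R, t such that world ≈ R @ cam + t."""
    cam_c, world_c = cam_pts.mean(axis=0), world_pts.mean(axis=0)
    H = (cam_pts - cam_c).T @ (world_pts - world_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                                  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = world_c - R @ cam_c
    return R, t

# Toy usage: recover a known transform from synthetic paired tooltip positions.
rng = np.random.default_rng(0)
cam = rng.normal(size=(50, 3))                                # tooltip positions in camera coordinates
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))             # random orthonormal basis
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1                                        # ensure a proper rotation
t_true = np.array([10.0, -5.0, 2.0])
world = cam @ R_true.T + t_true
R_est, t_est = fit_rigid_transform(cam, world)                # ≈ R_true, t_true
```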
... post-processing. However, global registration requires interactive inputs depending on the surgeon's expertise, involving manual adjustments [5], [6] and landmark-based semiautomatic alignment [7]. Later, some approaches [3], [8]-[10] primarily employ a landmark-based matching strategy: detecting landmarks (e.g., ridges, ligaments, and silhouettes) in both 2D intraoperative images and 3D preoperative models, and then recovering the pose parameters by solving the 3D-2D Perspective-n-Point (PnP) problem. ...
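For readers unfamiliar with the PnP step, a minimal sketch using OpenCV's solvePnP on landmark correspondences is shown below; the correspondences are synthetic stand-ins here, since in practice they come from the landmark detection step referenced above.

```python
# A minimal sketch of the 3D-2D Perspective-n-Point (PnP) pose recovery step.
import cv2
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])                       # assumed camera intrinsics
dist = np.zeros(5)                                    # assume undistorted images

# Synthetic stand-ins for detected landmarks: 3D points on the preoperative
# liver model and their projections in the intraoperative frame.
pts_3d = np.random.rand(8, 3) * 100.0                 # model landmarks (mm)
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([5.0, -10.0, 300.0])
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_true, tvec_true, K, dist)
pts_2d = pts_2d.reshape(-1, 2)

# Recover the camera pose from the 2D-3D correspondences (the PnP step).
ok, rvec, tvec = cv2.solvePnP(pts_3d, pts_2d, K, dist, flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)                            # [R | tvec] maps model coordinates into the camera frame
```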
Liver registration by overlaying preoperative 3D models onto intraoperative 2D frames can assist surgeons in perceiving the spatial anatomy of the liver clearly for a higher surgical success rate. Existing registration methods rely heavily on anatomical landmark-based workflows, which encounter two major limitations: 1) ambiguous landmark definitions fail to provide efficient markers for registration; 2) insufficient integration of intraoperative liver visual information in shape deformation modeling. To address these challenges, in this paper, we propose a landmark-free preoperative-to-intraoperative registration framework utilizing effective self-supervised learning. This framework transforms the conventional 3D-2D workflow into a 3D-3D registration pipeline, which is then decoupled into rigid and non-rigid registration subtasks. It first introduces a feature-disentangled transformer to learn robust correspondences for recovering rigid transformations. Further, a structure-regularized deformation network is designed to adjust the preoperative model to align with the intraoperative liver surface. This network captures structural correlations through geometry similarity modeling in a low-rank transformer network. To facilitate the validation of the registration performance, we also construct an in-vivo registration dataset containing liver resection videos of 21 patients, called P2I-LReg, which contains 346 keyframes that provide a global view of the liver together with liver mask annotations and calibrated camera intrinsic parameters. Extensive experiments and user studies on both synthetic and in-vivo datasets demonstrate the superiority and potential clinical applicability of our method.
... techniques, significantly reducing errors. Pelanis et al. [31] proposed a real-time alignment scheme based on a robotic C-arm, capable of updating the liver position and overlaying augmented reality in the laparoscopic view with an accuracy within 5 mm. However, the stability and effectiveness of this system still need to be clinically validated. ...
... Tracking efforts primarily focus on two target types: surgical instruments and anatomy. For example, the former is used for skill assessment [1,2], while the latter is necessary for the projection of virtual overlays [3]. ...
Laparoscopic video tracking primarily focuses on two target types: surgical instruments and anatomy. The former could be used for skill assessment, while the latter is necessary for the projection of virtual overlays. Where instrument and anatomy tracking have often been considered two separate problems, in this article, a method is proposed for joint tracking of all structures simultaneously. Based on a single 2D monocular video clip, a neural field is trained to represent a continuous spatiotemporal scene, used to create 3D tracks of all surfaces visible in at least one frame. Due to the small size of instruments, they generally cover a small part of the image only, resulting in decreased tracking accuracy. Therefore, enhanced class weighting is proposed to improve the instrument tracks. The authors evaluate tracking on video clips from laparoscopic cholecystectomies, where they find mean tracking accuracies of 92.4% for anatomical structures and 87.4% for instruments. Additionally, the quality of depth maps obtained from the method's scene reconstructions is assessed. It is shown that these pseudo‐depths have comparable quality to a state‐of‐the‐art pre‐trained depth estimator. On laparoscopic videos in the SCARED dataset, the method predicts depth with an MAE of 2.9 mm and a relative error of 9.2%. These results show the feasibility of using neural fields for monocular 3D reconstruction of laparoscopic scenes. Code is available via GitHub: https://github.com/Beerend/Surgical‐OmniMotion.
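The two depth metrics reported above (MAE and relative error) can be computed as in the following sketch; the masking of invalid ground-truth pixels is an assumption about the evaluation protocol, not a statement of the authors' exact procedure.

```python
# A small sketch of the depth metrics: mean absolute error and mean relative error.
import numpy as np

def depth_errors(pred, gt):
    """pred, gt: depth maps in mm with identical shape; gt == 0 marks invalid pixels."""
    mask = gt > 0
    abs_err = np.abs(pred[mask] - gt[mask])
    mae = abs_err.mean()                          # mean absolute error in mm
    rel = (abs_err / gt[mask]).mean() * 100.0     # mean relative error in percent
    return mae, rel
```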
... Deformable image registration (DIR) establishes a dense correspondence between two medical image volumes, playing a critical role in tasks such as disease phenotyping [1] and surgical guidance [2]. While DIR is widely used for image matching, its displacement fields lack guarantees of smoothness and invertibility, especially when handling large deformations. ...
Diffeomorphic deformable image registration ensures smooth invertible transformations across inspiratory and expiratory chest CT scans. Yet, in practice, deep learning-based diffeomorphic methods struggle to capture large deformations between inspiratory and expiratory volumes, and therefore lack inverse consistency. Existing methods also fail to account for model uncertainty, which can be useful for improving performance. We propose an uncertainty-aware test-time adaptation framework for inverse consistent diffeomorphic lung registration. Our method uses Monte Carlo (MC) dropout to estimate spatial uncertainty that is used to improve model performance. We train and evaluate our method for inspiratory-to-expiratory CT registration on a large cohort of 675 subjects from the COPDGene study, achieving a higher Dice similarity coefficient (DSC) between the lung boundaries (0.966) compared to both VoxelMorph (0.953) and TransMorph (0.953). Our method demonstrates consistent improvements in the inverse registration direction as well with an overall DSC of 0.966, higher than VoxelMorph (0.958) and TransMorph (0.956). Paired t-tests indicate statistically significant improvements.
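A minimal sketch of the Monte Carlo dropout step, together with the Dice similarity coefficient used for evaluation, is given below; the registration network itself and the way the uncertainty estimate drives test-time adaptation are left abstract and are not the authors' exact implementation.

```python
# A minimal sketch of MC-dropout uncertainty estimation and Dice evaluation.
import torch

def mc_dropout_predict(model, moving, fixed, n_samples=20):
    """Run several stochastic forward passes with dropout active at test time."""
    model.eval()
    for m in model.modules():                     # re-enable dropout layers only
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        fields = torch.stack([model(moving, fixed) for _ in range(n_samples)])
    # Mean displacement field and voxel-wise variance as spatial uncertainty.
    return fields.mean(0), fields.var(0)

def dice(a, b, eps=1e-6):
    """Dice similarity coefficient between two binary masks (e.g., lung boundaries)."""
    inter = (a * b).sum()
    return (2 * inter + eps) / (a.sum() + b.sum() + eps)
```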
... Three-dimensional (3D) reconstruction of surgical scenes has great potential to facilitate many downstream applications including intraoperative navigation [14], and visualization enhancement [15,11]. Besides, a high-quality and dynamic 3D scene model has demonstrated potential benefits to surgical training ...
... A key component of AR-guided laparoscopic navigation is achieving accurate 3D-to-2D registration, where 3D refers to the pre-operative 3D imaging (segmented as a mesh) and 2D to the live laparoscopic video. Some approaches use external tracking devices with optical markers to perform registration [8], [9]. While these systems offer robust registration, their application is limited due to the invasiveness of marker placement, the complexity and cost of deployment, marker occlusion, and challenges with tracking tissue deformation [10], [11]. ...
A major limitation of minimally invasive surgery is the difficulty in accurately locating the internal anatomical structures of the target organ due to the lack of tactile feedback and transparency. Augmented reality (AR) offers a promising solution to overcome this challenge. Numerous studies have shown that combining learning-based and geometric methods can achieve accurate preoperative and intraoperative data registration. This work proposes a real-time monocular 3D tracking algorithm for post-registration tasks. The ORB-SLAM2 framework is adopted and modified for prior-based 3D tracking. The primitive 3D shape is used for fast initialization of the monocular SLAM. A pseudo-segmentation strategy is employed to separate the target organ from the background for tracking purposes, and the geometric prior of the 3D shape is incorporated as an additional constraint in the pose graph. Experiments from in-vivo and ex-vivo tests demonstrate that the proposed 3D tracking system provides robust 3D tracking and effectively handles typical challenges such as fast motion, out-of-field-of-view scenarios, partial visibility, and "organ-background" relative motion.
... Its clinical integration into intraoperative procedures has been demonstrated in both open [88][89][90] and MILS [91,92] settings. The 3D navigation system allows the determination of the exact position of the intraparenchymal targeted lesion [93,94]. Although current data have revealed some advantages of the aforementioned innovations, including an increased R0 rate, a larger number of potentially treatable liver lesions, precise AR, improved operation time, and minimized blood loss [95], these technologies require additional evidence before official recognition in the future. ...
Hepatocellular carcinoma is the third leading cause of cancer mortality and the sixth most common cancer worldwide, posing a serious global health burden. Liver resection (LR) represents the main form of curative treatment and has evolved constantly, with massive progress over the last 20 years to improve the safety of hepatectomy and to broaden the indications for LR. This chapter highlights the recent advances in the surgical management of HCC, including (1) the optimization of future liver remnant (FLR) with portal vein embolization, associating liver partition and portal vein ligation for staged hepatectomy, and radiological simultaneous portohepatic vein embolization, (2) the advantages of anatomic LR compared to non-anatomic LR, (3) the minimally invasive liver surgery (MILS) approach via laparoscopic and robotic LR, (4) simulation and navigation with three-dimensional liver reconstruction, simulated LR, and the application of fluorescence imaging, (5) the utilization of new parenchymal transection devices, and (6) liver transplantation (LT) versus LR. With a deeper understanding of segmental liver anatomy, assistance from simulation and navigation systems, advances in FLR optimization, MILS, new parenchymal transection devices, and LT, liver surgeons should tailor the surgical plan to each individual to achieve the best outcome for patients.