Ultrasound and Fluoroscopic Images Fusion by
Autonomous Ultrasound Probe Detection
Peter Mountney1, Razvan Ionasec1, Markus Kaizer2, Sina Mamaghani1, Wen Wu1,
Terrence Chen1, Matthias John2, Jan Boese2 and Dorin Comaniciu1
1 Siemens Corporate Research & Technology, Princeton, USA
2 Siemens AG, Healthcare Sector, Forchheim, Germany
Abstract. New minimally invasive interventions such as transcatheter valve procedures exploit multiple imaging modalities to guide tools (fluoroscopy) and visualize soft tissue (transesophageal echocardiography, TEE). Currently, these complementary modalities are visualized in separate coordinate systems and on separate monitors, creating a challenging clinical workflow. This paper proposes a novel framework for fusing TEE and fluoroscopy by detecting the pose of the TEE probe in the fluoroscopic image. Probe pose detection is challenging in fluoroscopy, and conventional computer vision techniques are not well suited; current research requires manual initialization or the addition of fiducials. The main contribution of this paper is autonomous six-DoF pose detection that combines discriminative learning techniques with a fast binary template library. The pose estimation problem is reformulated to incrementally detect pose parameters by exploiting natural invariances in the image. The theoretical contribution of this paper is validated on synthetic, phantom and in vivo data. The practical application of the technique is supported by accurate results (< 5 mm in-plane error) and a computation time of 0.5 s.
1 Introduction
Percutaneous and minimally invasive cardiac procedures are progressively replacing conventional open-heart surgery for the treatment of structural and rhythmological heart disease. Catheters are used to access target anatomy through small vascular access ports, greatly reducing recovery time and the risk of complications associated with open surgery. Without direct access and visualization, the entire procedure is performed under imaging guidance. Two established modalities are currently used in operating rooms to provide real-time intra-operative images: X-ray fluoroscopy (Fluoro) and transesophageal echocardiography (TEE). Fluoro provides high-quality visualization of instruments and devices, which are typically radiopaque, while TEE, and more recently 3D TEE, can image soft tissue in great detail. Nevertheless, the complementary nature of TEE and Fluoro is barely exploited in today's practice, where the real-time acquisitions are not synchronized and images are visualized separately in misaligned coordinate systems.
Recently, the fusion of Fluoro and TEE has been proposed using either hardware or image based methods. Hardware based approaches [1],[2] attach additional devices to the ultrasound probe, such as electromagnetic [1] or mechanical [2] trackers, and align the device and Fluoro coordinate systems through calibration. Image based methods [3],[4],[5] attempt to use the appearance of the TEE probe in the Fluoro image to estimate the pose of the probe in the Fluoro coordinate system. These methods are attractive because they do not require the introduction of additional equipment into the theatre, which may disrupt clinical workflow.
Image based pose estimation is well studied, and the problem may be considered solved when the correspondence between 2D image points and a 3D model is known. Unfortunately, the appearance of the TEE probe in the Fluoro image makes establishing this correspondence challenging. The probe's appearance lacks texture or clear feature points and can be homogenous under low dose or close to dense tissue. To alleviate this problem, markers [5] may be retrofitted to the TEE probe. The pose of the probe is then estimated using well established computer vision techniques; however, the addition of markers increases the overall size of the probe. Alternatively, the natural geometry of the probe may be used to estimate its pose [3],[4]. The authors use a 2D/3D registration technique to refine the probe's pose estimate, and optimal results are obtained using two biplane images. The method is robust for small pose changes (10 mm / 10°); however, it requires manual initialization and does not update the registration in real-time, both of which are important in the clinical setting.
In this paper we propose a robust and fast learning-based method for the automated detection of the TEE probe pose, with six degrees of freedom, from Fluoro images. A probabilistic model-based approach is employed to estimate candidates for the in-plane probe position, orientation and scale parameters. Digitally reconstructed radiography (DRR) in combination with a binary template library is introduced for the estimation of the out-of-plane rotation parameters (pitch and roll). The approach does not require manual initialization, is robust over the entire pose parameter space, and is independent of specific TEE probe design and manufacturer. The performance of the algorithm is demonstrated on a comprehensive dataset of in vivo Fluoro sequences and validated on simulated and phantom data.
2 Fusion Framework
Information from a TEE volume can be visualized in a Fluoro image by aligning the TEE and C-arm Fluoro coordinate systems. A point Q_TEE in the ultrasound volume can be visualized in the Fluoro image at coordinate Q_Fluoro using the following transformation

    Q_Fluoro = P_projection R_xz T_d R_γ R_α ( R^W_TEE Q_TEE + T^W_TEE )

where P_projection is the projection matrix, R_xz and T_d are the transformation from the detector to the world coordinate system, R_γ and R_α are the angulations of the C-arm, and R^W_TEE and T^W_TEE are the rotation and position of the TEE probe in the world coordinate system, such that

    R^W_TEE = R_α^{-1} R_γ^{-1} R_xz^{-1} R^FluoroDetector_TEE
    T^W_TEE = R_α^{-1} R_γ^{-1} T_d^{-1} R_xz^{-1} T^FluoroDetector_TEE.

The TEE volume and Fluoro image can therefore be aligned if the position T^FluoroDetector_TEE = (x, y, z) and orientation R^FluoroDetector_TEE = (θ_r, θ_p, θ_y) of the TEE probe are known in the Fluoro detector coordinate system.
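To make the composition concrete, the following is a minimal sketch of this transformation chain in Python/NumPy. It assumes the C-arm transforms (R_xz, T_d, R_γ, R_α) are supplied as 4×4 homogeneous matrices and P_projection as a 3×4 projection matrix; the function names and calling convention are illustrative, not part of any system API.

```python
import numpy as np

def project_tee_point(Q_tee, P_projection, R_xz, T_d, R_gamma, R_alpha,
                      R_w_tee, T_w_tee):
    """Map a 3D point from TEE volume coordinates into the 2D Fluoro image.

    Implements Q_Fluoro = P_projection R_xz T_d R_gamma R_alpha
                          (R_w_tee Q_tee + T_w_tee),
    with R_xz, T_d, R_gamma, R_alpha as 4x4 homogeneous transforms,
    R_w_tee as a 3x3 rotation, T_w_tee a 3-vector, P_projection 3x4.
    """
    q = np.append(R_w_tee @ Q_tee + T_w_tee, 1.0)  # probe point -> world, homogeneous
    q = R_xz @ T_d @ R_gamma @ R_alpha @ q         # C-arm angulation, detector-to-world
    u, v, w = P_projection @ q                     # perspective projection
    return np.array([u, v]) / w                    # dehomogenize to image coordinates
```

With identity transforms and P_projection = np.hstack([np.eye(3), np.zeros((3, 1))]), the chain reduces to a plain pinhole projection, which is a quick sanity check for the composition order.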
Fig. 1. Detecting the pose of a TEE probe from a single Fluoro image: detect the probe position, orientation and scale (in-plane parameters), then detect roll and pitch by matching against a binary template library (out-of-plane parameters), and visualize.
2.1 TEE Probe Pose Detection
At the heart of our approach is the separation of the pose parameters into in-plane (x, y, z) and (θ_y) and out-of-plane (θ_r, θ_p) parameters (shown in Fig. 1). By marginalizing the estimation problem, the in-plane parameters can be efficiently estimated directly from the Fluoro image I_Fluoro, while being invariant to the out-of-plane parameters that are more challenging to determine.
The in-plane parameters can be computed from the probe's position (u, v), size (s) and orientation (θ_y) in the Fluoro image, the projection transformation P_projection of the Fluoro device, and the physical dimensions of the TEE probe. To detect the in-plane parameters (u, v), (s), (θ_y) from a Fluoro image I_Fluoro we use discriminative learning methods, as described in the next section.
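As an illustration of how these image-space measurements fix the in-plane 3D parameters, the sketch below back-projects a detection under an idealized pinhole model. The paper uses the device's actual projection matrix P_projection; the focal length f_px, principal point (c_u, c_v) and probe_width_mm here are hypothetical stand-ins for that calibration.

```python
import numpy as np

def inplane_pose_from_detection(u, v, s_px, f_px, c_u, c_v, probe_width_mm):
    """Back-project a 2D probe detection to an in-plane 3D position.

    Under an ideal pinhole model, depth follows from similar triangles:
    the probe's known physical width projects to s_px pixels, so
    z = f_px * probe_width_mm / s_px.
    """
    z = f_px * probe_width_mm / s_px   # distance along the optical axis (mm)
    x = (u - c_u) * z / f_px           # pixel offset from principal point -> mm
    y = (v - c_v) * z / f_px
    return np.array([x, y, z])
```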
The out-of-plane parameters are more challenging to estimate. The visual appearance of the probe in Fluoro varies greatly, making it challenging to learn a compact classifier, and the problem must therefore be treated in a fundamentally different way. A template library is created of the probe's appearance under out-of-plane orientations (θ_r, θ_p). Each template has an associated (θ_r, θ_p), and by matching the Fluoro image to the template library the out-of-plane parameters can be estimated.
Detecting In-plane Parameters
The in-plane parameters are estimated using discriminative learning methods. A classifier is trained to detect the position (u, v), size (s) and orientation (θ_y) of the TEE probe in the Fluoro image. The classifiers are trained using manually annotated Fluoro data. They are trained and applied sequentially: first, candidates are detected for (u, v), then the orientation (θ_y) is detected for each candidate, and finally the size (s) of the probe is detected. Each detector is trained using a Probabilistic Boosting Tree (PBT) with Haar-like and steerable features [6].
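The sequential pruning can be sketched as follows. This is a sketch only: pos_clf, ori_clf and size_clf are hypothetical objects standing in for the trained PBT detectors, and the exhaustive pixel scan is written naively for clarity rather than speed.

```python
def detect_inplane(image, pos_clf, ori_clf, size_clf, top_k=10):
    """Sequential in-plane detection: position candidates -> orientation -> size."""
    # Stage 1: score every pixel as a probe-tip candidate; keep the best top_k.
    candidates = [(u, v) for v in range(image.shape[0])
                  for u in range(image.shape[1])]
    candidates.sort(key=lambda c: pos_clf.score(image, c), reverse=True)
    candidates = candidates[:top_k]

    # Stage 2: test orientations at 6 degree steps for each surviving candidate;
    # the orientation detector is more discriminative and also rejects outliers.
    best = max(((c, theta) for c in candidates for theta in range(0, 360, 6)),
               key=lambda h: ori_clf.score(image, h[0], h[1]))
    (u, v), theta_y = best

    # Stage 3: detect the tip/shaft points, constrained by position + orientation.
    s = size_clf.detect(image, (u, v), theta_y)
    return (u, v), theta_y, s
```

The design point the sketch illustrates is that each stage searches only the hypotheses that survived the previous, cheaper stage, so the more discriminative detectors never scan the full image.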
The position (u, v) detector is trained on manual annotations and negative examples taken randomly from the Fluoro image. The Fluoro image is resized to 128 × 128 and a window of 35 × 35 is centered at the annotation. A pool of 100,000 Haar features is used to train the PBT. The appearance of the probe varies greatly; to avoid overfitting, a classifier is created which is less discriminative but highly likely to detect the tip of the probe.
The orientation (θ_y) detector is trained on manually annotated data and the false positives from the position detector. Additional negative training data is created, centered on the annotation but with incorrect orientation parameters. The PBT is trained with five features, including the relative intensity and the difference between two steerable filters [6]. The orientation detector is trained at intervals of 6° with 360° coverage. This detector is more discriminative than the position detector and therefore removes outliers as well as estimating the orientation.
The size (s) detector is trained to detect the two points where the tip of the probe meets the shaft. The PBT is trained using Haar features. During detection, the orientation and position of the probe are used to constrain the search area for the size detector.
Detecting Out-of-plane Parameters
The appearance of the probe under roll and pitch (θ_r, θ_p) varies significantly in the Fluoro image and cannot generally be accounted for in the image space using the same techniques as the in-plane parameters. The out-of-plane parameters must be treated in a fundamentally different way. The proposed solution is to build a template library containing Fluoro images of the probe under different (θ_r, θ_p). The (θ_r, θ_p) parameters are estimated by matching an image patch in I_Fluoro (normalized for the in-plane parameters) with the template library.
A comprehensive template library should contain a wide variety of orientations. It is not feasible to build this library from in vivo data, as it is challenging to manually annotate (θ_r, θ_p) and the data may not provide complete coverage of the parameter space. The library is therefore constructed using DRRs, which simulate X-ray Fluoro by tracing light rays through a 3D volume. In this work a DynaCT of the TEE probe is acquired (512 × 512 × 488, 0.2225 mm resolution). The orientation and position of the probe were manually annotated and (θ_r, θ_p) rotations are applied to the volume.
Searching a large template library can be computationally expensive, so the size of the library is limited to reduce the search space. The probe is not free to move in all directions, due to the physical constraints of the tissue. In addition, because the X-ray image is formed by integrating along rays, objects appear the same under symmetrical poses. This is exploited to reduce the size of the template library. The library is built with pitch (θ_p) from -45° to 45° and roll (θ_r) from -90° to 90° at 2° intervals. This subsampled library is still large and expensive to store and search. To make the problem computationally tractable, a binary template representation is used [7],[8]. Binary templates are an efficient way of storing discriminative information for fast matching. The image patch is divided into sub-regions and features are extracted for each region. The dominant orientation [7] of the gradient in the sub-region is used as a feature; this has been shown to work well on homogenous regions and objects which lack texture, as is the case for the TEE probe in the Fluoro image. The orientations are discretized into 8 orientation bins, and each sub-region is represented as a single byte whose bits correspond to the 8 bins: a bit is set to 1 if the corresponding orientation exists in the sub-region and 0 if it does not. The binary template for the image patch comprises the set of bytes corresponding to its sub-regions, a compact and discriminative representation of the patch.
Input templates F(I_Fluoro) extracted from the Fluoro image are matched to templates F(O) in the library using

    ε(I_Fluoro, O, c) = Σ_r δ( F(I_Fluoro, c + r), F(O, r) )

where δ is a binary function which returns true if the features in two regions match, F(I_Fluoro, c + r) is the input template centered on candidate c = (u, v) in image I_Fluoro, F(O, r) is a template in the library, and r indexes the sub-regions. The function counts how many sub-regions in the two templates are the same. The template in the library with the highest count is taken to be the best match, and its associated (θ_r, θ_p) as the out-of-plane parameters. This function can be evaluated very quickly using a bitwise AND operation followed by a bit count, enabling the library to be searched efficiently.
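Because each sub-region is a single byte, ε reduces to an elementwise AND plus a count, as sketched below. The sketch assumes the library's DRRs have already been converted to binary templates with the previous sketch, and counts a sub-region as matching when the two bytes share at least one orientation bit.

```python
import numpy as np

def match_score(t_in, t_lib):
    """delta summed over sub-regions: a region matches when the input and
    library bytes share at least one orientation bit (bitwise AND != 0)."""
    return int(np.count_nonzero(t_in & t_lib))

def best_pose(t_in, library):
    """Return the (roll, pitch) whose template best matches the input patch."""
    return max(library, key=lambda pose: match_score(t_in, library[pose]))
```

With 4,186 poses and an 8 × 8 grid, a full search is on the order of 4,186 × 64 byte operations per candidate, which is why the exhaustive search remains fast.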
3 Results
The proposed method for probe pose detection was validated on synthetic, phantom and in vivo datasets. Throughout our experiments a GE TEE transducer was used. The synthetic dataset includes 4050 simulated Fluoro images (DRRs) generated from a 3D C-arm volume (DynaCT, 512 × 512 × 488, 0.2225 mm pixel spacing) of the TEE probe. The ground truth was generated by annotating the 3D probe position in the DynaCT volume. The phantom dataset includes a volumetric DynaCT of the TEE probe inserted into a silicon phantom, and a total of 51 Fluoro images (960 × 960, 0.184 mm pixel spacing) captured by rotating the C-arm while the TEE probe remained static.
Fig. 2. Fluoroscopic images illustrating probe detection and estimation of in-plane parameters.
The position of the C-arm is known from the robotic control, which enabled ground truth to be computed for each Fluoro image using the 3D probe annotation. The in vivo dataset was acquired during several porcine studies and includes 50 Fluoro sequences comprising around 7,000 frames (512 × 512, 0.345 mm pixel spacing). The data contains images with background clutter, catheter tools and variety in the pose of the probe, C-arm angulations, dose and anatomy. The pose parameters were manually annotated in all frames and taken as ground truth for training and testing.
In the first experiment, quantitative and qualitative performance evaluation of the in-plane parameter (u, v, θ_y) detection was performed on all three datasets. The detector was trained on 75% of the in vivo dataset (36 sequences, 5,363 frames) and tested on the entire synthetic and phantom datasets and the remaining 25% of the in vivo dataset. The results are summarized in Table 1.
For the in vivo data the average in-plane position (u, v) errors were 2.2 and 3.7 mm and the in-plane orientation error was 6.69°. Errors in the position estimation are caused by false detections along the shaft of the probe; these false position detections in turn contribute to errors in the orientation estimation. The true positive rate is 0.88 and the false positive rate is 0.22. Detection and accuracy are affected by dose level, proximity to dense tissue and background clutter. The detection framework performs best when the probe is clearly distinguishable from its background. Fig. 2 illustrates detection examples and the nature of in vivo images, with cluttered backgrounds and a low-textured probe.
Table 1. Quantitative validation of the in-plane position (u, v) and orientation (θ_y). Average errors.

  Data        u (mm)      v (mm)      θ_y (°)
  Synthetic   1.1 (1.1)   2.2 (3.9)   2.6 (3.2)
  Phantom     1.6 (1.4)   2.0 (1.2)   3.0 (3.4)
  In vivo     2.2 (5.1)   3.7 (8.0)   6.6 (16.7)

Fig. 3. Error analysis (degrees) of (θ_r, θ_p) over the search space.
Fig. 4. Top: Fluoro images showing the detected pose of the probe. Bottom: left, Fluoro image; center, mitral valve detected in 3D TEE; right, valve model visualized in Fluoro.
The results for the phantom and synthetic data are provided in Table 1, where detection was performed at a fixed scale. The Fluoro data from the phantom experiment appears different from the in vivo data used to train the detectors, making it challenging. The true positive rate was 0.95 and the false positive rate 0.05. False detections were caused by the density of the silicon phantom, which obscures the probe in three images. The true positive and false positive rates for the synthetic data were 0.99 and 0.01, respectively. The visual appearance of the synthetic DRRs is similar to the training data and the probe is clearly distinguishable, resulting in a high true positive rate.
The out-of-plane (θ_r, θ_p) detectors were analyzed on the synthetic data to evaluate the accuracy of the binary template matching. Fig. 3 plots the (θ_r, θ_p) error over the search space (degrees) and illustrates stable detection with a single outlier.
The framework is evaluated with respect to all parameters (Table 2). Quantitative validation was performed on synthetic and phantom data (in vivo ground truth was not available). The largest error is along the Z axis, which corresponds to the optical axis of the Fluoro device. This is expected, because estimating distance along the optical axis from a monocular Fluoro image is challenging. Fortunately, the goal of the framework is to visualize anatomy in the Fluoro image, so errors in Z have little effect on the final visualization. Initial clinical feedback suggests errors of up to 15° and 10 mm (excluding Z) are acceptable for some visualizations; however, accuracy requirements are application specific. Qualitative evaluation (Fig. 4, top) is performed on in vivo Fluoro images.
Table 2. Quantitative validation of TEE probe detection. Errors.

  Data        X (mm)        Y (mm)       Z (mm)        θ_r (°)       θ_p (°)      θ_y (°)
  Synthetic   0.82 (0.79)   0.97 (2.1)   64.0 (13.9)   4.2 (10.5)    4.6 (9.0)    2.6 (3.2)
  Phantom     1.1 (0.8)     0.7 (0.6)    19.04 (1.6)   11.5 (12.0)   11.8 (9.8)   3.0 (3.4)
The computational performance was evaluated (Intel 2.13 GHz single core, 3.4 GB RAM). The average detection time is 0.53 seconds. The computational cost can be reduced by incorporating temporal information to reduce the search space.
To illustrate the clinical relevance of this work, an anatomical model of the mitral valve is detected [9] in 3D TEE and visualized in Fluoro (Fig. 4, bottom). The data is not synchronized and is manually fused; a catheter is visible in both modalities.
4 Conclusions
This paper presents a novel method for automated fusion of TEE and Fluoro images
to provide guidance for cardiac interventions. The proposed system detects the pose
of a TEE probe in a Fluoro image. Discriminative learning is combined with fast bina-
ry template matching to address the challenges of pose detection. Validation has been
performed on synthetic, phantom and in vivo data. The method is capable of detecting
in 0.5s with an in-plane accuracy of less than 5 mm. Future work will focus on incor-
porating temporal information, using the initial detected pose as a starting estimate for
pose refinement and visualization of anatomically meaningful information.
5 References
1. Jain, A., Gutierrez, L., Stanton, D.: 3D TEE Registration with X-Ray Fluoroscopy for Interventional Cardiac Applications. FIMH, pp. 321–329 (2009).
2. Ma, Y., Penney, G.P., Bos, D., Frissen, P., Rinaldi, C.A., Razavi, R., Rhode, K.S.: Hybrid echo and x-ray image guidance for cardiac catheterization procedures by using a robotic arm: a feasibility study. Physics in Medicine and Biology 55, 371–382 (2010).
3. Gao, G., Penney, G., Ma, Y., Gogin, N., Cathier, P., Arujuna, A., Morton, G., Caulfield, D., Gill, J., Aldo Rinaldi, C., Hancock, J., Redwood, S., Thomas, M., Razavi, R., Gijsbers, G., Rhode, K.: Registration of 3D trans-esophageal echocardiography to X-ray fluoroscopy using image-based probe tracking. Medical Image Analysis 16, 38–49 (2012).
4. Gao, G., Penney, G., Gogin, N., Cathier, P., Arujuna, A., Wright, M., Caulfield, D., Rinaldi, A., Razavi, R., Rhode, K.: Rapid Image Registration of 3D Transesophageal Echocardiography and X-Ray for Guidance of Cardiac Interventions. IPCAI, pp. 124–134 (2010).
5. Lang, P., Seslija, P., Chu, M.W.A., Bainbridge, D., Guiraudon, G.M., Jones, D.L., Peters, T.M.: US-Fluoroscopy Registration for Transcatheter Aortic Valve Implantation. IEEE Transactions on Biomedical Engineering 59, 1444–1453 (2012).
6. Wu, W., Chen, T., Wang, P., Zhou, S.K., Comaniciu, D., Barbu, A., Strobel, N.: Learning-based hypothesis fusion for robust catheter tracking in 2D X-ray fluoroscopy. CVPR, pp. 1097–1104 (2011).
7. Hinterstoisser, S., Lepetit, V., Ilic, S., Fua, P., Navab, N.: Dominant orientation templates for real-time detection of texture-less objects. CVPR, pp. 2257–2264 (2010).
8. Taylor, S., Drummond, T.: Multiple Target Localisation at over 100 FPS. BMVC (2009).
9. Ionasec, R.I., Voigt, I., Georgescu, B., Wang, Y., Houle, H., Vega-Higuera, F., Navab, N., Comaniciu, D.: Patient-specific modeling and quantification of the aortic and mitral valves from 4-D cardiac CT and TEE. IEEE Transactions on Medical Imaging 29, 1636–1651 (2010).