BIRL: Benchmark on Image Registration methods with Landmark validation

A PREPRINT
Jiri Borovec
FEE, Czech Technical University in Prague
jiri.borovec@fel.cvut.cz
February 2, 2020
ABSTRACT
This report presents a generic image registration benchmark with automatic evaluation using landmark annotations. The key features of the BIRL framework are: easy extensibility, performance evaluation, parallel experimentation, simple visualisations, a time-out limit for experiments, and resuming of unfinished experiments. From research practice, we identified and focused on two main use-cases: (a) comparing a user's (newly developed) method with some State-of-the-Art (SOTA) methods on a common dataset and (b) experimenting with SOTA methods on a user's custom dataset (which should contain landmark annotations).
Moreover, we present an integration of several standard image registration methods aimed at biomedical imaging into the BIRL framework. This report also contains experimental results of these SOTA methods on the CIMA dataset, which is a dataset of Whole Slice Imaging (WSI) from histology/pathology containing several multi-stain tissue samples from three tissue kinds.
Source and results: https://borda.github.io/BIRL
Keywords: Image registration · Benchmark · Landmark annotation · Biomedical imaging · Stain histology
1 Introduction
Image registration is a crucial task in several domains. This report focuses mainly on biomedical image registration [1–7], and in particular on Whole Slice Imaging (WSI) in histology/pathology [8], although the framework can easily be reused in other domains such as material analysis, surveillance, etc.
In digital pathology [9], one of the simplest and yet most useful features is the ability to view serial sections of tissue simultaneously on a computer screen [10–14]. This enables the pathologist to evaluate the histology and expression of multiple markers for a patient in a single review [15]. However, the rate-limiting step in this process is the time taken for the pathologist to open each individual image, align the sections within the viewer, and then manually move around the section. In addition, due to tissue processing and pre-analytical steps, sections with different stains have non-linear variations among their acquisitions; they may also stretch and change shape from section to section, or some tissue fraction may be damaged or missing [16–20].
It is generally known that WSI image registration is not a well-solved problem compared to other single-modal tasks such as MRI, CT or ultrasound. Multi-stain WSI registration can be regarded as an almost multi-modal problem, since the variety of stains used dramatically changes the appearance model, and the deformation may range from a fine elastic transformation to a completely missing section (due to mechanical processes / sample preparation) and flopping of samples during image sensing [8, 19, 21, 22]. In recent years we have noticed a steady number of papers aiming at WSI image registration, but quite often they lack a fair comparison of the newly developed method with well-established methods for the particular domain [23]. This became the primary motivation to collect and annotate a histology dataset of WSI microscopy images and to develop an image registration evaluation framework to fill this gap.
In the past, there were a few image registration benchmarks [10, 24–27] and challenges [11, 28]. Typically, they focused on a very narrow domain (single-modal) and used relatively small images (about a thousand pixels, compared to a typical WSI with tens of thousands of pixels in the image diagonal). The limitation of the latest attempt [28] to democratise image registration benchmarking is the need for tight integration into its framework (which may even lead to a time-costly rewrite of the whole method), compared to our BIRL, which can run with almost anything.
Let us briefly describe the context and history of this work. First, we introduced a benchmark [23] comparing several methods on quite small images of up to 5k pixels (with respect to WSI sizes), where not all images were allowed to be public. Later, the BIRL framework was redesigned, and we introduced the ANHIR challenge [29] (footnote 1), also with some hidden/private data for method evaluation. Finally, we present only publicly available image registration methods evaluated on the publicly available CIMA dataset.
2 BIRL framework
BIRL is a lightweight Python framework for easy image registration experimentation and benchmarking on datasets with landmark-like annotations.
Let us summarise the main features of this framework:
- automatic execution of image registration on a sequence of image pairs
- integrated evaluation of registration performance using the relative Target Registration Error (rTRE)
- integrated visualisations of performed registrations
- running several image registration experiments in parallel
- resuming an unfinished sequence of a registration benchmark
- creating/handling custom datasets and designing custom experiments
- basic image pre-processing, e.g. colour normalisation
- stand-alone post-processing: evaluation and visualisation for finished benchmarks
Moreover, the framework is developed as an installable package so any particular functionality can be reused in other
similar projects, for example calling the same metrics or handling datasets.
2.1 Benchmark workflow
The core benchmark class is designed to be inherited, so that only the needed experimental calls have to be overridden in a (child) class for a specific image registration method. In particular, the benchmark workflow is the following:
1. preparing the experiment's environment, e.g. creating the experiment folder, copying configurations, etc.;
2. loading the required data, i.e. parsing the experiment's image/landmarks pairs;
3. performing the benchmark sequence (optionally in parallel), saving particular results (registration outputs and statistics) to a common table and optionally creating a visualisation of the performed experiments;
4. evaluating results over all performed experiments;
5. summarising and exporting results from the complete benchmark.
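The core of this workflow, including optional parallel execution and resuming, can be pictured with the following minimal sketch. It is not the BIRL implementation: `run_benchmark`, `register_pair` and the `finished` dictionary are hypothetical names introduced here for illustration, and a thread pool merely stands in for whatever parallelism the framework actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(pairs, register_pair, finished=None, nb_workers=2):
    """Register all image pairs and return one record per pair.

    pairs: list of (source, target) items, as loaded in step 2.
    register_pair: hypothetical callable performing one registration.
    finished: records {pair index: record} kept from an interrupted run.
    """
    results = dict(finished or {})
    # resuming: only pairs without a stored record are executed
    todo = [(idx, pair) for idx, pair in enumerate(pairs) if idx not in results]
    with ThreadPoolExecutor(max_workers=nb_workers) as pool:
        # Executor.map yields the records in the order of `todo`
        records = pool.map(lambda item: register_pair(*item[1]), todo)
        for (idx, _), record in zip(todo, records):
            results[idx] = record
    # common table: one row per pair, in the original order
    return [results[idx] for idx in range(len(pairs))]
```

Passing the records of an interrupted run as `finished` means that only the missing pairs are registered again, which is the behaviour the resuming feature is meant to provide.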
2.2 Metrics
There are a few evaluation approaches, which depend on the annotations available for a dataset. The most common, and the least time-demanding in terms of obtaining adequate annotation, is to use landmarks and measure their distances [10, 23–25, 30]; another option is to use image segmentation and measure its overlap [11, 26, 27]. The precision of the two measures is sensitive to the landmark density or to the detail of the segmentation (meaning the number of classes used and the degree of detail captured), respectively.
As stated in the title, we use landmark-based evaluation. Assume we have sets of landmarks (key points marking uniquely identifiable structures) $L^F$ and $L^M$ in the fixed image $I^F$ and the moving image $I^M$, respectively, and particular landmarks $x^F \in L^F$ and $x^M \in L^M$ marking the same biological structure in both images. Moreover, we have a set of warped landmarks $\hat{x}^M$, obtained by warping $x^M$ to match $x^F$.
Footnote 1: https://anhir.grand-challenge.org
Figure 1: Sample visualisation of TRE for wrongly (top), partially well (middle) and fine (bottom) registered image
pairs. The lines represent the relation between particular landmarks in moving/fixed/warped images: (green) the true
mapping, (blue) estimated mapping and (red) TRE.
The evaluation is based on the Target Registration Error (TRE) between two sets of landmarks $x^F$ and $\hat{x}^M$ in the two images. Let us denote
$$\mathrm{TRE} = d_e(x^F_l, \hat{x}^M_l) \qquad (1)$$
where $x^F_l$ and $\hat{x}^M_l$ are the associated landmark coordinates and $d_e(\cdot)$ is the Euclidean distance. As image sizes differ across datasets, which makes an overall comparison difficult, all TRE values are normalised by the diagonal of the fixed image $I^F$ to a relative TRE (rTRE),
$$\mathrm{rTRE} = \frac{\mathrm{TRE}}{\mathrm{diagonal}(I^F)} \qquad (2)$$
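As a small worked example of Eq. (1) and (2), the following sketch computes TRE and rTRE for a toy pair of landmark sets. The function names and the (height, width) convention for the image size are assumptions made for this illustration and are not taken from the BIRL code.

```python
import math


def compute_tre(points_fixed, points_warped):
    """Eq. (1): Euclidean distance for each pair of corresponding landmarks."""
    return [math.dist(xf, xw) for xf, xw in zip(points_fixed, points_warped)]


def compute_rtre(points_fixed, points_warped, fixed_image_size):
    """Eq. (2): TRE divided by the diagonal of the fixed image (in pixels)."""
    height, width = fixed_image_size
    diagonal = math.hypot(height, width)
    return [tre / diagonal for tre in compute_tre(points_fixed, points_warped)]


# toy example: a 600 x 800 pixel fixed image has a diagonal of 1000 pixels
landmarks_fixed = [(10.0, 20.0), (300.0, 400.0)]
landmarks_warped = [(13.0, 24.0), (300.0, 400.0)]
rtre = compute_rtre(landmarks_fixed, landmarks_warped, (600, 800))
```

Here the first warped landmark is 5 pixels away from its counterpart, which corresponds to an rTRE of 0.005 (0.5 % of the image diagonal), while the second one matches exactly.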
As a success-rate-like measure, we introduce the robustness $R$ as a relative value describing how many landmarks from $L$ improved their TRE through the performed image registration compared to the initial TRE; formally
$$R = \frac{1}{|L|} \sum_{l \in L} \left[\!\left[ \mathrm{TRE}(x^F_l, \hat{x}^M_l) < \mathrm{TRE}(x^F_l, x^M_l) \right]\!\right] \qquad (3)$$
All failing image registrations are considered to keep the initial position, $\hat{x}^M = x^M$.
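The robustness of Eq. (3) can be illustrated in a few lines. This is a sketch under the definitions above rather than the framework's own code; the function name and the convention of passing `None` for a failed registration are assumptions.

```python
import math


def robustness(points_fixed, points_moving, points_warped=None):
    """Eq. (3): fraction of landmarks whose TRE decreased after registration.

    A failed registration is modelled by points_warped=None, in which case
    the warped landmarks are taken to be the initial moving landmarks.
    """
    if points_warped is None:
        points_warped = points_moving
    improved = [
        math.dist(xf, xw) < math.dist(xf, xm)
        for xf, xm, xw in zip(points_fixed, points_moving, points_warped)
    ]
    return sum(improved) / len(improved)


fixed = [(0, 0), (10, 10), (20, 20), (30, 30)]
moving = [(5, 0), (10, 15), (20, 20), (40, 30)]
warped = [(1, 0), (10, 12), (20, 21), (45, 30)]
score = robustness(fixed, moving, warped)  # two of four landmarks improved
failed = robustness(fixed, moving)  # failed registration
```

Because the inequality in Eq. (3) is strict, a failed registration, for which the warped landmarks equal the initial ones, contributes a robustness of zero.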
Let us also denote the following aggregation measures over a single experiment (registration of an image pair): $m_i(\cdot) = \mathrm{median}_{\mathrm{image}}(\cdot)$ and $s_i(\cdot) = \max_{\mathrm{image}}(\cdot)$; and over the dataset: $a_d(\cdot) = \mathrm{mean}_{\mathrm{dataset}}(\cdot)$ and $m_d(\cdot) = \mathrm{median}_{\mathrm{dataset}}(\cdot)$. These are used in a cascade, meaning that the measures are first computed over particular images and then aggregated over the whole dataset, for example $a_d(m_i(\mathrm{rTRE}(x^F_l, \hat{x}^M_l)))$, which is later shortened to AMrTRE. The motivation for using the median is a lower penalisation for a few inaccurate landmarks while most of them are registered well. These statistics also assume a uniform distribution of landmark annotations over the examined tissue sample.
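The cascade of aggregations can be sketched as follows, first over the landmarks of each image pair and then over the dataset. The function name and the toy values are illustrative assumptions; the dictionary keys follow the metric names used in the summary figure of this report.

```python
from statistics import mean, median


def aggregate(rtre_per_pair):
    """Cascade: aggregate rTRE per image pair, then over the whole dataset."""
    median_per_image = [median(values) for values in rtre_per_pair]  # m_i
    max_per_image = [max(values) for values in rtre_per_pair]  # s_i
    return {
        "Average-Median-rTRE": mean(median_per_image),  # a_d(m_i(.)), AMrTRE
        "Median-Median-rTRE": median(median_per_image),  # m_d(m_i(.))
        "Average-Max-rTRE": mean(max_per_image),  # a_d(s_i(.))
    }


# toy rTRE values for three image pairs with three landmarks each
scores = aggregate([
    [0.01, 0.02, 0.30],
    [0.04, 0.05, 0.06],
    [0.01, 0.01, 0.02],
])
```

In the first pair a single badly registered landmark (0.30) dominates the maximum but leaves the median at 0.02, which is the lower penalisation mentioned above.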
2.3 Typical use-cases
We have identified the two most common use-cases for this framework, that is, how a user can benefit most from this work with minimal effort.
Comparing with SOTA on a common dataset. A quite common problem with newly developed methods is that their results are presented only on a private custom dataset, which may be poorly described and, moreover, is usually small. This missing comparison with SOTA methods on a standard (well-described) dataset can be fixed by integrating the new method into the BIRL framework, running the benchmark and comparing the new results with the presented scores. For this case, the user needs to override only the fraction of methods/functions that is essential for their image registration method and its evaluation:
- _prepare_img_registration(...): used if some extra preparation is needed before running the image registration itself, e.g. converting images to a different format. [before each image registration experiment]
- _execute_img_registration: executes the image registration; the run time of this method is measured as the execution time. If the user calls an external method from a command line, they override only _generate_regist_command(...), which prepares the registration command to be executed automatically. [core of each image registration experiment]
- _extract_warped_image_landmarks(...): extracts the required warped landmarks, or performs the landmark warping at this stage if it was not already done as a part of the image registration. [after each image registration experiment]
- _clear_after_registration(...): removes temporary files generated during the image registration to keep the results lightweight. [after each image registration experiment]
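The override pattern described by these four methods might look roughly like the sketch below. To keep it runnable on its own, `BenchmarkBase` is a stand-in written for this illustration and is not the BIRL base class; the item keys, the `my-register` command and the output file name are likewise hypothetical, and the real method signatures should be taken from the BIRL source.

```python
import os


class BenchmarkBase:
    """Stand-in for the core benchmark class (hypothetical, NOT the BIRL API)."""

    def run_single(self, item):
        item = self._prepare_img_registration(item)
        item["command"] = self._generate_regist_command(item)
        # a real benchmark would execute the command here and measure its time
        item.update(self._extract_warped_image_landmarks(item))
        self._clear_after_registration(item)
        return item

    def _prepare_img_registration(self, item):
        return item  # no extra preparation by default

    def _generate_regist_command(self, item):
        raise NotImplementedError

    def _extract_warped_image_landmarks(self, item):
        raise NotImplementedError

    def _clear_after_registration(self, item):
        return item  # nothing to clean by default


class MyToolBenchmark(BenchmarkBase):
    """Child class wrapping a hypothetical command-line tool 'my-register'."""

    def _generate_regist_command(self, item):
        return "my-register --fixed %s --moving %s --out %s" % (
            item["fixed"], item["moving"], item["output_dir"])

    def _extract_warped_image_landmarks(self, item):
        path = os.path.join(item["output_dir"], "warped-landmarks.csv")
        return {"warped_landmarks": path}


record = MyToolBenchmark().run_single(
    {"fixed": "f.png", "moving": "m.png", "output_dir": "out"})
```

Only the two methods specific to the external tool are overridden here; preparation and clean-up fall back to the defaults of the parent class.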
Exploring SOTA methods on a custom dataset. This is another practical use-case, especially for biomedical experts starting to explore a new dataset. We suppose the user may ask: "What is the best method for aligning my new images together?" They can then prepare the dataset pairing, a CSV table with rows containing paths to the source and target images and their landmark annotations. If a performance evaluation/analysis is not required and a simple visual evaluation is sufficient, the landmark annotation can be omitted.
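Such a pairing table could be produced with a few lines of Python. The column names below are assumptions made for this illustration; the exact names expected by BIRL should be checked in its documentation. Empty landmark fields correspond to the case where only a visual evaluation is wanted.

```python
import csv
import io

# assumed column names, for illustration only
COLUMNS = ["Source image", "Source landmarks", "Target image", "Target landmarks"]


def build_pairing_table(pairs):
    """Return CSV text with one row per (source, target) registration pair."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for src_image, src_landmarks, tgt_image, tgt_landmarks in pairs:
        writer.writerow({
            "Source image": src_image,
            "Source landmarks": src_landmarks,
            "Target image": tgt_image,
            "Target landmarks": tgt_landmarks,
        })
    return buffer.getvalue()


table = build_pairing_table([
    ("slide-HE.png", "slide-HE.csv", "slide-Ki67.png", "slide-Ki67.csv"),
    ("slide-ER.png", "", "slide-PR.png", ""),  # landmarks omitted
])
```

The file names are placeholders; in practice the returned text would be written to a `.csv` file that is then passed to the benchmark.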
3 Experiments
With the framework in hand, we added some State-of-the-Art (SOTA) methods and ran them on a prepared dataset. We start with a brief description of the integrated SOTA methods, followed by a recapitulation of the CIMA dataset, and finish by presenting the results of the SOTA methods on this WSI dataset (also showing visualisations produced by this framework).
3.1 Standard image registration methods
There are many standard and/or widely used methods/software/frameworks for biomedical image registration. We stick only to publicly available ones that are capable of producing warped landmarks based on the estimated transformation, which we need for our performance evaluation.
Advanced Normalisation Tools (ANTs) [31] is a registration toolkit aimed at MR imaging, which uses the Insight Segmentation and Registration Toolkit (ITK, footnote 2) as a backend. ANTs allows creating a custom image registration pipeline composed of several transformations and similarity measures in a multi-scale scheme. [In our experiments, we used a combination of an affine registration with Mattes Mutual Information (MMI) followed by SyN registration with the Cross-Correlation (CC) similarity measure.]
Footnote 2: https://itk.org
Code     Name
Cc10     Clara cell 10 protein
CD31     Platelet endothelial cell adhesion molecule
ER       Estrogen receptor
H&E      Hematoxylin and Eosin
HER-2    Human epidermal growth factor receptor 2
Ki67     Antigen KI-67
PR       Progesterone receptor
proSPC   Prosurfactant protein C
Table 1: Complete list of biological markers used for staining histology tissue samples.
bUnwarpJ [32] is an ImageJ/Fiji [33] plugin which estimates a symmetric non-linear B-spline-based deformation. The minimised criterion is the sum of squared differences (SSD) in a multi-resolution scheme. [We experimented with two versions: (bUnwarpJ) using only the image intensities and (bUnwarpJ+SIFT) combining image intensity with the SIFT detector, equally balancing both fractions.]
DROP [34, 35] differs from most other methods by using discrete optimisation (solved efficiently in a multi-resolution fashion using linear programming) to minimise a sum of absolute differences (SAD) criterion.
Elastix [36] is image registration software based on ITK, offering several transformations/metrics/optimisations in a multi-resolution scheme. [In our experiments, we used B-spline image registration with Adaptive Stochastic Gradient Descent minimising the Advanced MMI criterion.]
NiftyReg [37, 38] is open-source software performing linear and/or non-linear registration for two- and three-dimensional images. The linear registration is based on a block-matching technique, and the non-linear one uses Free-Form Deformation. [In our experiments, we used an R wrapper for this software and defined a two-step registration, starting with linear and followed by non-linear.]
Register Virtual Stack Slices (RVSS) [32] is another ImageJ/Fiji registration plugin, which extends bUnwarpJ and relies on SIFT [39] feature points, offering several deformation types. [In our experiments, we used similarity for feature matching and affine for the final alignment.]
Note that methods allowing other configurations of transformation, metric, optimisation, etc. can be easily reconfigured within this benchmark (with a simple parameter file) without any code changes.
3.2 CIMA dataset
The dataset [40] (footnote 3) consists of WSI microscopy images of 2D pathological tissue slices, stained with different markers (for stain explanation see Tab. 1), and landmarks (manually annotated, uniquely identified biological structures) in each slice. The main challenges for these images are the following: (i) enormous image size, (ii) appearance differences, and (iii) lack of objects with a distinctive appearance. In particular, this dataset contains nine tissue samples of three different tissue types, which together form 108 image pairs.
Let us start with a short description of the particular tissue samples, followed by the landmarks and the forming of registration image pairs. For the dataset overview, see Tab. 2.
Brief description of the three tissue types included in the CIMA dataset:
- Lung lesion: Unstained adjacent 3µm formalin-fixed paraffin-embedded sections were cut from the blocks and stained with H&E or by immunohistochemistry with a specific antibody for CD31, proSPC, CC10 or Ki67. Images of three mice lung lesions (adenoma or adenocarcinoma) were acquired with a Zeiss Axio Imager M1 microscope (Carl Zeiss, Jena, Germany) equipped with a dry Plan Apochromat objective (numerical aperture NA = 0.95, magnification 40×, pixel size 0.174µm/pixel).
- Lung lobes: The images of the four whole mice lung lobes correspond to the same set of histological samples as the lesion tissue. They were also acquired with a Zeiss Axio Imager M1 microscope (Carl Zeiss, Jena, Germany) equipped with a dry EC Plan-Neofluar objective (NA = 0.30, magnification 10×, pixel size 1.274µm/pixel).
Footnote 3: http://cmp.felk.cvut.cz/~borovji3/?page=dataset
Name             µm/pixel  images  points     Avg. size  10k zoom
                           in set  per image  [pixels]   [%]
Lung Lesions 1   0.174     5       78         16k        50
Lung Lesions 2   0.174     5       101        23k        25
Lung Lesions 3   0.174     5       80         16k        50
Lung lobes 1     1.274     5       98         10k        100
Lung lobes 2     1.274     5       107        10k        100
Lung lobes 3     1.274     5       80         9k         100
Lung lobes 4     1.274     5       86         9k         100
Mammary gland 1  2.294     5       82         22k        25
Mammary gland 2  2.294     8       76         20k        25
Table 2: Dataset summary per tissue sample with used scales for the 10k dataset scale.
         Cc10  CD31  H&E  Ki67  proSPC
Cc10     -     1     2    3     4
CD31     1     -     5    6     7
H&E      2     5     -    8     9
Ki67     3     6     8    -     10
proSPC   4     7     9    10    -
Table 3: An example of forming 10 registration pairs from a set of 5 differently stained images. Equal pair indexes in the upper-right and lower-left parts mark the mirroring pairs, where the struck-through ones are omitted.
- Mammary glands: The sections are cut from two mammary gland blocks and stained with H&E (even sections) or, alternately, with an antibody against ER, PR, or Her2-neu (odd sections). They were also acquired with a Zeiss Axio Imager M1 microscope (Carl Zeiss, Jena, Germany) equipped with a dry EC Plan-Neofluar objective (NA = 0.30, magnification 10×, pixel size 1.274µm/pixel).
Landmarks. The landmarks, as mentioned earlier, are points marking unique biological structures appearing in all images (tissue slices) of a single tissue sample, where each slice is stained by a different marker. The landmarks for each image (stained tissue slice) are stored in a table (a CSV file) where the lines match the same biological structures across the stained slices. This format is very intuitive, and it has the standard ImageJ structure and coordinate frame: the origin (0, 0) of the coordinate system is set to the top-left corner of the image. As for annotation quality, each landmark annotation was performed by two experts, and each annotation was validated independently by another expert. Further information about the landmarks and the annotation procedure is available on the web (footnote 4), along with annotation tools which allow adding new landmarks and sharing them among all dataset users.
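Reading such landmark tables and pairing them row by row can be sketched as below. The exact file layout is an assumption here: an unnamed index column followed by `X` and `Y` columns, which is a common way of exporting point tables; the dataset documentation should be consulted for the actual header.

```python
import csv
import io


def load_landmarks(csv_text):
    """Parse landmark coordinates (x, y) from CSV text with X and Y columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["X"]), float(row["Y"])) for row in reader]


# toy tables for two stained slices; row k marks the same structure in both
slice_a = ",X,Y\n0,10.5,20.0\n1,30.0,40.25\n"
slice_b = ",X,Y\n0,12.0,21.5\n1,29.0,41.0\n"
matched = list(zip(load_landmarks(slice_a), load_landmarks(slice_b)))
```

Since the rows are matched by position, corresponding landmarks are obtained simply by iterating over both tables in parallel; the coordinates are measured from the top-left image corner.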
Pairing images. As the stained slices within each tissue sample are very close to each other and we have unified annotations over all slices in a set, we register all images (slices) to each other, which increases the number of registration pairs, see Tab. 3. We assume that the registration of two images (fixed and moving) is symmetric, $I^M \rightarrow I^F$ and $I^M \leftarrow I^F$, so we drop mirroring pairs.
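Forming the unordered pairs within one tissue sample amounts to taking all 2-element combinations of its slices. The short sketch below (with a hypothetical helper name) reproduces the 10 pairs of the example table and the total of 108 pairs implied by the set sizes in the dataset summary (eight sets of five images and one set of eight).

```python
from itertools import combinations
from math import comb


def form_pairs(stains):
    """All unordered pairs of slices; mirrored pairs are not generated."""
    return list(combinations(stains, 2))


pairs = form_pairs(["Cc10", "CD31", "H&E", "Ki67", "proSPC"])

# number of pairs over the whole dataset, from the number of images per set
images_per_set = [5, 5, 5, 5, 5, 5, 5, 5, 8]
total_pairs = sum(comb(n, 2) for n in images_per_set)
```

A set of n slices yields n(n-1)/2 pairs, so a set of five gives 10 pairs and the set of eight gives 28, which adds up to 8 × 10 + 28 = 108.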
3.3 Experimental setting
We ran the experiments on a Linux server with 24 CPUs and 250 GB of RAM. As its performance is above that of a standard machine, the framework offers an option to normalise the execution time to any other (defined) reference machine. Moreover, we can run four experiments in parallel.
The dataset allows using scales. We experiment on WSI (denoted as "full") and also on mixed-size images close to 10k pixels in the image diagonal (denoted as "10k"), the same as the ANHIR challenge did (for the particular scales used in 10k, see Tab. 2).
We set a hard time-out limit of 3 hours per single image registration; if a method does not finish within this time frame, it is terminated and considered a failure case.
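One way to enforce such a hard limit on an external registration command is the `timeout` argument of Python's `subprocess.run`. The sketch below is an illustration of the idea, not the mechanism BIRL itself uses, and the helper name is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_sec):
    """Return True if `command` finished successfully within the limit."""
    try:
        completed = subprocess.run(command, timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising
        return False
    return completed.returncode == 0


# a quick command passes, a slow one is terminated and counted as a failure
ok = run_with_timeout([sys.executable, "-c", "pass"], timeout_sec=30)
late = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(10)"], timeout_sec=0.5)
```

For the setting described above, the limit would be `timeout_sec=3 * 60 * 60`, and a `False` result would be recorded as a failed registration.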
Footnote 4: https://borda.github.io/dataset-histology-landmarks
Method           Scope  Median rTRE [%]       Max rTRE [%]   Robustness [%]         Time [min]
                        Avg. ± STD    Median  Avg. ± STD     Avg. ± STD     Median  Avg. ± STD
ANTs             10k    2.30 ± 2.00   1.67    5.56 ± 3.66    79.02 ± 24.82  89.25   52.17 ± 26.89
ANTs             full   3.85 ± 2.88   3.47    7.36 ± 4.34    30.35 ± 41.13  0.00    210.46 ± 106.41
bUnwarpJ         10k    2.82 ± 2.08   3.00    6.66 ± 3.66    74.07 ± 27.75  83.13   2.99 ± 1.13
bUnwarpJ         full   3.26 ± 1.93   3.36    7.11 ± 3.49    67.61 ± 28.93  68.81   14.01 ± 15.34
bUnwarpJ + SIFT  10k    7.43 ± 7.93   4.23    13.53 ± 12.24  49.70 ± 38.36  54.01   2.66 ± 1.17
bUnwarpJ + SIFT  full   5.90 ± 5.36   4.26    12.40 ± 12.66  50.97 ± 34.74  54.01   11.36 ± 11.59
DROP             10k    2.50 ± 5.11   0.51    6.29 ± 7.92    84.25 ± 30.46  98.68   1.86 ± 0.84
DROP             full   2.81 ± 3.89   0.89    6.66 ± 6.20    61.23 ± 43.46  86.51   11.71 ± 8.21
Elastix          10k    3.79 ± 2.90   3.47    8.29 ± 4.47    79.55 ± 16.48  80.77   4.02 ± 0.75
Elastix          full   4.11 ± 2.76   3.80    8.39 ± 4.37    70.61 ± 16.10  69.74   16.92 ± 11.69
RNiftyReg        10k    3.20 ± 1.93   3.22    6.85 ± 3.49    68.05 ± 29.69  70.20   0.36 ± 0.6
RNiftyReg        full   3.16 ± 1.94   3.27    6.83 ± 3.61    69.77 ± 29.22  78.14   0.39 ± 0.56
RVSS             10k    5.95 ± 15.83  3.24    9.48 ± 16.54   57.45 ± 28.93  54.12   1.50 ± 0.83
RVSS             full   4.03 ± 2.67   3.61    7.63 ± 4.45    55.13 ± 22.08  52.42   4.59 ± 3.61
Table 4: Aggregated results of all standard image registration methods over both dataset scopes (sizes: 10k pixels in image diagonal and full WSI microscopy images).
[Figure 2: radar chart with the axes Average-Max-rTRE, Average-Median-rTRE, Median-Median-rTRE, Average-Norm-Time and Average-Weakness; compared methods: DROP, ANTs, bUnwarpJ, RNiftyReg, Elastix, RVSS, bUnwarpJ-SIFT.]
Figure 2: Summary visualisation of benchmark results over several key metrics for all methods on the 10k dataset scope. For all metrics, the closer to the centre, the better. The weakness measure is the inverse of robustness.
The best parameters for each method were set up based on the authors' recommendations (e.g. bUnwarpJ, RVSS and DROP) or experimentally if the first option was not feasible (footnote 5).
3.4 Results
We observed a gap between the results on the 10k and the full dataset scope, since some methods are not able to work on some very large images from the full scope; typically they fail on pixel indexing or on saving images larger than 32k pixels. The summary results on both dataset scopes are presented in Tab. 4.
Footnote 5: All the configurations used can be found on the BIRL project page.
[Figure 3: four box-plot panels over the methods DROP, ANTs, bUnwarpJ, RNiftyReg, Elastix, RVSS and bUnwarpJ-SIFT; vertical axes: MrTRE (log scale), SrTRE (log scale), Robust. (0 to 1) and Time [min] (log scale).]
Figure 3: Box-plot visualisation of particular measure distribution on dataset scope 10k: (top-left) median rTRE, (top-right) maximal rTRE, (bottom-left) robustness and (bottom-right) execution time.
[Figure 4: two panels comparing size-10k and size-full for the methods ANTs, DROP, Elastix, RNiftyReg, RVSS, bUnwarpJ-SIFT and bUnwarpJ; axes: MrTRE (log scale) and Robust. (0 to 1).]
Figure 4: Distribution comparison between the two dataset scopes, 10k and full size, presented on (left) median rTRE and (right) robustness.
Because of rather technical/implementation limitations, we present most visual comparisons on the 10k dataset scope, except for the comparison of the two scopes in Fig. 4.
First, we show the radar chart in Fig. 2, aggregating over all major metrics. We can see that most of the performance curves are parallel, meaning that a particular method usually performs better than its competitors in all aspects.
To show in detail how the integrated methods perform on the CIMA dataset in the four main metrics, namely median and max rTRE per image (denoted as MrTRE and SrTRE, respectively), robustness and execution time, we use the box plots in Fig. 3, where the methods are ordered from left to right by increasing AMrTRE. In Fig. 3, we can also observe a correlation among all quality measures (MrTRE, SrTRE and Robust.) for all methods; for instance, DROP has the lowest AMrTRE and also the highest robustness.
[Figure 5: two panels over the tissue kinds lung-lesion, lung-lobes and mammary-gland; vertical axes: Avg. Median rTRE (log scale) and Avg. Robust.]
Figure 5: Performance visualisation of particular methods depending on tissue kind: (left) average median rTRE and (right) robustness on dataset scope 10k.
Fig. 4 addresses the comparison between the two dataset scopes. Looking at the robustness, we can see a significant increase in completely failed registrations for ANTs and DROP when moving from the 10k to the full scope, compared to the other methods, where we see a less significant quality drop. The MrTRE increases consistently for all methods when moving from the 10k to the full scope.
We have also examined the performance depending on the tissue type, as the tissues differ significantly in appearance: repetitive texture patterns and tissue separability from the background (and size, in particular for the full image scope), see Fig. 5. From this perspective, there is no clear trend in the methods' performance across tissues; for example, Elastix and DROP perform better on lung lobes compared to the other methods.
4 Conclusion
In this report, we briefly introduced the developed image registration framework, which uses datasets with landmark annotations, and presented its main application use-cases. We described the CIMA histology dataset with some more information about image sensing, landmark annotation and image pairing. We then presented selected standard image registration methods integrated into the BIRL framework and their results on the presented CIMA dataset with illustrative visualisations.
Hence, any future work can use this as a starting point and can be compared with these results as a baseline.
References
[1] J. B. A. Maintz and M. A. Viergever. A survey of medical image registration. Medical Image Analysis, 2(1):1–36, 1998.
[2] B. Zitová and J. Flusser. Image registration methods: a survey. Image and Vision Computing, 21(11):977–1000, 2003.
[3] J. Salvi, C. Matabosch, D. Fofi, and J. Forest. A review of recent range image registration methods with accuracy evaluation. Image and Vision Computing, 25(5):578–596, 2007.
[4] A. Sotiras, C. Davatzikos, and N. Paragios. Deformable Medical Image Registration: A Survey. IEEE Transactions on Medical Imaging, 32(7):1153–1190, 2013.
[5] F. Alam, S. U. Rahman, M. Hassan, and A. Khalil. An investigation towards issues and challenges in medical image registration. Journal of Postgraduate Medical Institute, 31(3):224–233, 2017.
[6] G. Haskins, U. Kruger, and P. Yan. Deep Learning in Medical Image Registration: A Survey. 2019.
[7] F. Alam and S. U. Rahman. Challenges and Solutions in Multimodal Medical Image Subregion Detection and Registration. Journal of Medical Imaging and Radiation Sciences, 50(1):24–30, 2019.
[8] O. Déniz, D. Toomey, C. Conway, G. Bueno, et al. Multi-stained whole slide image alignment in digital pathology. In Proc. SPIE Medical Imaging, volume 9420, page 94200Z, 2015.
[9] M. Gurcan, L. E. Boucheron, A. Can, A. Madabhushi, N. M. Rajpoot, and B. Yener. Histopathological image analysis: A review. IEEE Reviews in Biomedical Engineering, 2:147–171, 2009.
[10] J. West, J.M. Fitzpatrick, and M.Y. Wang. Comparison and evaluation of retrospective intermodality brain image
registration techniques. Journal of Computer Assisted Tomography, 21(4):554–568, 1997. 1, 2
[11] K. Murphy, B. Van Ginneken, J. M. Reinhardt, and et al. Evaluation of registration methods on thoracic CT: The
EMPIRE10 challenge. IEEE Transactions on Medical Imaging, 30(11):1901–1920, 2011. 1, 2
[12] G. J. Metzger, S. C. Dankbar, J. Henriksen, and et al. Development of multigene expression signature maps at
the protein level from digitized immunohistochemistry slides. PloS One, 7(3):e33520, jan 2012. 1
[13] I. Garcia, G. Mayol, E. Rodríguez, M. Suñol, T. R. Gershon, and et al. Expression of the neuron-specific protein
CHD5 is an independent marker of outcome in neuroblastoma. Molecular Cancer, 9:1–14, 2010. 1
[14] Z. S. Novakovic, M. G. Durdov, L. Puljak, M. Saraga, D. Ljutic, T. Filipovic, and et al. The interstitial expression
of alpha-smooth muscle actin in glomerulonephritis is associated with renal function. Medical Science Monitor,
18(4):235–240, 2012. 1
[15] P. J. Thul and C. Lindskog. The human protein atlas: A spatial map of the human proteome. Protein Science,
27(1):233–244, 2018. 1
[16] M. H. Chin, A. B. Geng, A. H. Khan, and et al. A genome-scale map of expression for a mouse brain section
obtained using voxelation. Physiological Genomics, 30(3):313–321, 2007. 1
[17] X. M. Lopez, P. Barbot, Y. R. Van Eycke, L. Verset, and et al. Registration of whole immunohistochemical slide
images: An efficient way to characterize biomarker colocalization. Journal of the American Medical Informatics
Association, 22(1):86–99, 2014. 1
[18] Y. Song, D. Treanor, A. Bulpitt, and et al. 3D reconstruction of multiple stained histology images. Journal of
Pathology Informatics, 4(2):7, 2013. 1
[19] K. Kartasalo, L. Latonen, T. Visakorpi, P. Nykter, and P. Ruusuvuori. Benchmarking of image registration
methods for 3D tissue reconstruction. In IEEE International Conference on Image Processing,, number 269474,
pages 2–6, 2016. 1
[20] K. Kartasalo, L. Latonen, T. Visakorpi, M. Nykter, and P. Ruusuvuori. Comparative Analysis of Tissue Recon-
struction Algorithms for 3D Histology. Bioinformatics, 34(17):2360–2364, 2018. 1
[21] J. Borovec, J. Kybic, M. Bušta, C. Ortiz-de Solorzano, and A. Munoz-Barrutia. Registration of multiple stained
histological sections. In IEEE International Symposium on Biomedical Imaging (ISBI), pages 1034–1037, San
Francisco, USA, 2013. 1
[22] J. Kybic and J. Borovec. Automatic simultaneous segmentation and fast registration of histological images. In
IEEE International Symposium on Biomedical Imaging (ISBI), pages 774 777, 2014. 1
[23] J. Borovec, A. Munoz-Barrutia, and J. Kybic. Benchmarking of Image Registration Methods for Differently
Stained Histological Slides. In IEEE International Conference on Image Processing (ICIP), pages 3368–3372,
Athens, 2018. 1, 2
[24] R. Castillo, E. Castillo, R. Guerra, and et al. A framework for evaluation of deformable image registration spatial
accuracy using large landmark point sets. Physics in medicine and biology, 54(7):1849–1870, 2009. 2
[25] Y Ou, H Akbari, M Bilello, X Da, and C Davatzikos. Comparative Evaluation of Registration Algorithms
in Different Brain Databases With Varying Difficulty: Results and Insights. IEEE Transactions Med. Imag.,
33(10):2039–2065, 2014. 2
[26] G. E. Christensen, X. Geng, J. G. Kuhl, J. Bruss, and et al. Introduction to the Non-Rigid Image Registration
Evaluation Project. In Lecture Notes in Computer Science, volume 4057, pages 128–135, 2006. 2
[27] A. Klein, J. Andersson, B Ardekani, and et al. Evaluation of 14 nonlinear deformation algorithms applied to
human brain MRI registration. Neuroimage, 46(3):786–802, 2009. 2
[28] K. Marstal, F. Berendsen, N. Dekker, M. Staring, and S. Klein. The continuous registration challenge:
Evaluation-as-a-service for medical image registration algorithms. In Proceedings - International Symposium
on Biomedical Imaging, volume 2019-April, pages 1399–1402. IEEE, 2019. 2
[29] J. Borovec, J. Kybic, A. Munoz-Barrutia, et al. Automatic Non-rigid Histological Image Registration
challenge. In IEEE International Symposium on Biomedical Imaging (ISBI), Venice, Italy, 2019. 2
[30] K. K. Brock. Results of a Multi-Institution Deformable Registration Accuracy Study (MIDRAS). International
Journal of Radiation Oncology, Biology, Physics, 76(2):583–596, 2010. 2
[31] B. B. Avants, C. L. Epstein, M. Grossman, and J. C. Gee. Symmetric diffeomorphic image registration with cross-
correlation: Evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis,
12(1):26–41, 2008. 4
[32] I. Arganda-Carreras, C. Sorzano, R. Marabini, et al. Consistent and elastic registration of histological sections
using vector-spline regularization. In Computer Vision Approaches to Medical Image Analysis, volume 4241,
pages 85–95, 2006. 5
[33] J. Schindelin, I. Arganda-Carreras, E. Frise, et al. Fiji: an open-source platform for biological-image analysis.
Nature Methods, 9(7):676–682, 2012. 5
[34] B. Glocker, N. Komodakis, G. Tziritas, and N. Navab. Dense image registration through MRFs and efficient
linear programming. Medical Image Analysis, 12(6):731–741, 2008. 5
[35] B. Glocker, A. Sotiras, N. Komodakis, N. Paragios, et al. Deformable Medical Image Registration: Setting the
State of the Art with Discrete Methods. Annual Review of Biomedical Engineering, 13(1):219–244, 2011. 5
[36] S. Klein, M. Staring, and K. Murphy. Elastix: a toolbox for intensity-based medical image registration. IEEE
Transactions on Medical Imaging, 29(1), 2010. 5
[37] S. Ourselin, A. Roche, and G. Subsol. Reconstructing a 3D structure from serial histological sections. Image
and Vision Computing, 19(1-2):25–31, 2001. 5
[38] M. Modat, D. Cash, P. Daga, G. Winston, et al. A symmetric block-matching framework for global
registration. In SPIE Medical Imaging 2014: Image Processing, volume 9034, page 90341D, March 2014. 5
[39] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004. 5
[40] R. Fernandez-Gonzalez, A. Jones, E. Garcia-Rodriguez, P. Y. Chen, A. Idica, S. J. Lockett, M. H. Barcellos-Hoff,
and C. Ortiz de Solórzano. System for combined three-dimensional morphological and molecular analysis of
thick tissue specimens. Microscopy Research and Technique, 59:522–530, 2002. 5